GitHub
Airflow Backfill Tools
- Worked with a team of 3 to design and build the infrastructure that lets users submit backfill requests to Airflow. Implemented the backfill workers, which use the Apache Airflow backfill command to carry out backfill jobs over extended periods, manage and retry failed jobs, and surface non-retryable errors to end users.
- Conducted small-scale user research to gather feedback and implemented features that ease adoption of the tools.
- Saved 5+ engineer hrs/week
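The retry behavior described above can be sketched roughly as follows. This is a minimal illustration, not the team's actual worker: `BackfillRequest`, `run_backfill`, `is_retryable`, and `NonRetryableError` are hypothetical names, the retry limit is arbitrary, and the command line assumes the Airflow 2.x CLI form (`airflow dags backfill`).

```python
import subprocess
from dataclasses import dataclass


class NonRetryableError(Exception):
    """Raised when a backfill fails in a way that should be shown to the user."""


@dataclass
class BackfillRequest:
    # Hypothetical request shape; the real tool's fields are not known.
    dag_id: str
    start_date: str  # e.g. "2020-01-01"
    end_date: str


def build_command(request):
    # Assumes the Airflow 2.x CLI; other Airflow versions use a different form.
    return [
        "airflow", "dags", "backfill",
        "--start-date", request.start_date,
        "--end-date", request.end_date,
        request.dag_id,
    ]


def run_backfill(request, run=subprocess.run, max_attempts=3,
                 is_retryable=lambda result: True):
    """Run the backfill, retrying failures until max_attempts is reached.

    Returns the attempt number that succeeded. `is_retryable` is a
    hypothetical hook that decides whether a failed run is worth retrying.
    """
    command = build_command(request)
    for attempt in range(1, max_attempts + 1):
        result = run(command, capture_output=True, text=True)
        if result.returncode == 0:
            return attempt
        if not is_retryable(result):
            # Surface the error instead of retrying.
            raise NonRetryableError(result.stderr)
    raise NonRetryableError(f"gave up after {max_attempts} attempts")
```

Passing `run` as a parameter keeps the sketch testable without an Airflow installation; a real worker would more likely classify failures by inspecting the command's output.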
Sage Bionetworks
Synapse-Repository-Services
The code base for the Synapse.org server and its Java client.
Discussion Feature
- Enabled discussion between Synapse users and the Synapse Support Team (Synapse Help Forum), between Dream Challenge participants and organizers, including the Digital Mammography Dream Challenge, between Sage and data contributors, including the Bridge - Lily project, and in any project created and shared by users.
- Learn more
Access Control Tool
- Enabled the Access and Compliance Team to manage terms and requirements and to track, review, reject, and grant access for groups of users within Synapse.
- Streamlined the process to participate in Dream Challenges, including the Parkinson's Disease Digital Biomarker DREAM Challenge, and the NIH Data Commons.
- Learn more
Refactor Java Client
- Rewrote the Synapse Java Client implementation to use SimpleHttpClient.
- Enabled other applications that use the Synapse Java Client, including the Bridge Exporter, to programmatically restart the connection pool when it becomes saturated because of a connection leak or other reasons.
Synapse-Warehouse-Workers
Data collected from the Synapse.org server are stored in S3. Synapse-Warehouse-Workers imports the captured snapshots into its partitioned MySQL database and pre-processes them, allowing its client to query the data for reports.
SynapseWebClient
The web client of Synapse.org.
- UI for Discussion feature
- UI for Docker feature
SimpleHttpClient
A thin wrapper around Apache's HttpClient that provides a simple interface for HTTP calls that only transfer JSON objects.
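To illustrate the idea of a thin JSON-only wrapper, here is a minimal Python sketch built on the standard library. It does not reproduce the Java SimpleHttpClient API; `build_json_request` and `send_json` are hypothetical names chosen for this example.

```python
import json
import urllib.request


def build_json_request(url, payload=None, method="GET"):
    """Build a request whose body (if any) is a JSON-encoded object."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    headers = {"Accept": "application/json"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def send_json(url, payload=None, method="GET", opener=urllib.request.urlopen):
    """Send the request and decode the response body as JSON.

    `opener` is injectable so callers can substitute their own transport.
    """
    with opener(build_json_request(url, payload, method)) as response:
        body = response.read()
    return json.loads(body) if body else None
```

Callers pass and receive plain dictionaries and never deal with encoding, headers, or response streams, which is the main appeal of this kind of wrapper.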
synapser
The R client of Synapse.org.
The Build System
- Proposed the plan for how to develop, build, and release synapser.
- Organized the builds on Jenkins and wrote documentation for maintenance.
- Organized the Gists used in the build system into a GitHub repository called CI-Build-Tools.
- Slides
PythonEmbedInR
Forked from the PythonInR repository.
Data Conversion
- Added testthat and a set of test cases to verify the conversion behavior.
- Fixed bugs related to converting data from Python 3 to R.
- Ensured private methods of Python objects are not exposed in R.
Python Package Wrapper Utilities
- Extended the PythonEmbedInR package with utility functions that generate R wrappers and documentation for Python packages.
- Allowed R users to select the functions, classes, Enums, and modules to expose in R.
- Enabled R users to intercept and alter the returned object in R.
- Provided instructions and examples on how to create an R package by wrapping a Python package.
synapsePythonClient
The Python client of Synapse.org.
Enable Single Thread
- Added the ability to use the Synapse Python client in a single-threaded environment (in synapser).
Housekeeping
- Stabilized the integration tests.
- Reduced development build time from several days to ~20 minutes.
- Simplified the integration tests to use only one test user and to exercise only the HTTP calls, not the server's logic.
- Directed the integration tests to build against a development stack.
Communication Channels
- Exposed release notes in the Synapse Python client docs.
- Highlighted deprecated methods in the Synapse Python client docs.
IT
A collection of IT tasks.
Test User for Development Stack
- Added a build to create a test user for a development stack.
- Allowed client builds to run integration tests against the development stack.