ICTR Data Science Core


The mission of the Data Science Core is to provide a platform for the collection, storage, analysis, and presentation of big data, including electronic health records and ‘omics data. Our team specializes in HIPAA-compliant local and cloud computing solutions and machine-learning techniques that address the needs of data science research. We use Amazon Web Services and its applications, such as the SageMaker machine learning platform, Comprehend natural language processing, and the Service Workbench collaborative environment, alongside customized on-premises data processing, to offer secure storage, rapid analysis, and advanced outputs. The Data Science Core provides a foundation optimized for data science at every level to maximize researchers’ productivity.

Services include:

  • Data Acquisition from data warehouses and repositories
  • Bulk Data Processing to prepare data for downstream analysis
  • Data Analysis using data-appropriate analysis tools, including machine learning and cloud-based processing
  • Data Construction, Maintenance, and Manipulation via REDCap
  • Software Development, in particular bioinformatic pipelines and analysis tools
  • Data Visualization, including static visualization for publication and dynamic visualization for websites
  • Data Science Consultation to help researchers design a rigorous data science study that will stand up to peer review

If your data science project would benefit from a cloud-based platform optimized for your type of project, or if you are thinking about starting a data science project and would like guidance on what to do next, contact the Data Science Core for an initial consultation.

Data Science Core Team

  • Bo Peng, PhD – Associate Professor
  • Dakai Zhu – IT Infrastructure & Systems Associate
  • Spiridon Tsavachidis – Lead Bioinformatics Programmer
  • Xiangjun Xiao – Lead Programmer Analyst
  • Ang Li – Member

Related Resources: