Data Science on the Google Cloud Platform

Cloud Data Science Services

CiDrep Data Science and Analytics services are hosted and run exclusively on cloud platforms. This lets data science teams organize their work, access data and computing resources from anywhere, and build, train, deploy, and manage models.

Our cloud services make it easy for researchers and data scientists to take their projects from ideation to research discovery and publication, quickly and cost-effectively.

Overall, our solutions make data science teams more productive and enable them to deploy data pipelines using statistical and machine learning methods and tools on any cloud platform. Our methodology is a hands-on guide to implementing an end-to-end data platform on GCP, AWS, or IBM for a DeepVariant analysis pipeline, which uses a deep neural network to call genetic variants from next-generation DNA sequencing data. It covers how to:

  • Automate and schedule data ingest, using an App Engine application
  • Create and populate a dashboard in Google Data Studio
  • Build a real-time analysis pipeline to carry out streaming analytics
  • Conduct interactive data exploration with Google BigQuery
  • Create a Bayesian model on a Cloud Dataproc cluster
  • Build a logistic regression machine-learning model with Spark
  • Compute time-aggregate features with a Cloud Dataflow pipeline
  • Create a high-performing prediction model with TensorFlow 
  • Use your deployed model as a microservice you can access from both batch and real-time pipelines
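
To make the "time-aggregate features" step in the list above more concrete, here is a minimal sketch in plain Python. It is not Cloud Dataflow or Apache Beam code; it only illustrates the underlying idea of grouping timestamped values into fixed-size windows and averaging them per key. The function name, the event layout (timestamp in seconds, key, value), and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict


def window_averages(events, window_seconds):
    """Average values per (key, fixed time window).

    events: iterable of (timestamp_seconds, key, value) tuples (hypothetical layout).
    Returns a dict mapping (key, window_start_seconds) to the mean value.
    """
    # For each (key, window) pair, keep a running [sum, count].
    sums = defaultdict(lambda: [0.0, 0])
    for timestamp, key, value in events:
        # Align the timestamp to the start of its fixed window.
        window_start = (timestamp // window_seconds) * window_seconds
        bucket = sums[(key, window_start)]
        bucket[0] += value
        bucket[1] += 1
    return {k: total / count for k, (total, count) in sums.items()}


# Hypothetical sample events: (timestamp in seconds, key, measured value).
events = [
    (0, "sample_a", 10.0),
    (1800, "sample_a", 20.0),
    (3600, "sample_a", 40.0),
    (100, "sample_b", 5.0),
]

# One-hour windows: sample_a has two readings in the first hour, one in the second.
features = window_averages(events, window_seconds=3600)
for (key, start), mean in sorted(features.items()):
    print(key, start, mean)
```

In a managed pipeline the same grouping would typically be expressed with the windowing primitives of the pipeline framework, so that it can run unchanged over batch and streaming input.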

Our partner’s Cloud Genomics platforms offer petabyte-scale genomic data processing and analysis, so you can securely store, process, explore, and share large and complex genomic datasets. Supported workflows include GATK Best Practices for variant discovery in whole-genome sequencing (WGS) data; an RNA-Seq pipeline that demonstrates Nextflow usage; a Sentieon DNAseq pipeline; and a dsub pipeline that creates an index (BAI file) from a large binary file of DNA sequences (BAM file).
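
As a rough illustration of the dsub indexing workflow mentioned above, the sketch below assembles a dsub command line that runs `samtools index` on one BAM file. It only builds and prints the command; it does not submit anything. The project ID, bucket paths, container image, region, and provider value are placeholders chosen for this example, and the right provider depends on the dsub version and cloud setup in use.

```python
import shlex


def build_dsub_index_command(provider, project, regions, logging_path,
                             bam_uri, bai_uri, image):
    """Assemble a dsub argv list that indexes one BAM file with samtools.

    All argument values are supplied by the caller; nothing is submitted here.
    """
    return [
        "dsub",
        "--provider", provider,
        "--project", project,
        "--regions", regions,
        "--logging", logging_path,
        # dsub exposes inputs and outputs to the command as environment variables.
        "--input", f"BAM={bam_uri}",
        "--output", f"BAI={bai_uri}",
        "--image", image,
        "--command", "samtools index ${BAM} ${BAI}",
        "--wait",
    ]


# Hypothetical placeholder values; replace with your own project and bucket.
argv = build_dsub_index_command(
    provider="google-batch",            # assumption: depends on your dsub setup
    project="my-project",               # hypothetical project ID
    regions="us-central1",
    logging_path="gs://my-bucket/logs",  # hypothetical bucket
    bam_uri="gs://my-bucket/input/sample.bam",
    bai_uri="gs://my-bucket/output/sample.bam.bai",
    image="example/samtools:latest",    # hypothetical image that contains samtools
)

# Print a shell-safe version of the command for review before running it.
print(shlex.join(argv))
```

Building the command as a list rather than a single string keeps the `${BAM}` and `${BAI}` placeholders intact until dsub expands them inside the container.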