Data Science on Cloud Platforms

Cloud Analytics and Machine Learning

CiDrep Data Science and Analytics services are hosted and run exclusively on cloud platforms. This lets data science teams organize their work, access data and computing resources from anywhere, and build, train, deploy, and manage data models.

Our cloud services make it easy for researchers and data scientists to take their projects from ideation to research discovery and publication, quickly and cost-effectively.

Analytics and Machine Learning Services
We can help you harness the cloud’s array of big data processing and analytics tools to deliver data-driven insights at a speed and scale that were previously out of reach. Combined with our strong informatics team and cognitive computing practices, CiDrep uses the cloud’s machine learning engines to provide machine learning offerings that help solve your toughest business challenges.

Overall, our solutions make data science teams more productive and enable them to deploy and implement data pipelines using statistical and machine learning methods and tools on any cloud platform. Our methodology serves as a hands-on guide to implementing an end-to-end data platform on GCP, AWS, or IBM, for example for a DeepVariant analysis pipeline, which uses a deep neural network to call genetic variants from next-generation DNA sequencing data. Typical steps include:

  • Automate and schedule data ingestion using an App Engine application
  • Create and populate a dashboard in Google Data Studio
  • Build a real-time analysis pipeline to carry out streaming analytics
  • Conduct interactive data exploration with Google BigQuery
  • Create a Bayesian model on a Cloud Dataproc cluster
  • Build a logistic regression machine-learning model with Spark
  • Compute time-aggregate features with a Cloud Dataflow pipeline
  • Create a high-performing prediction model with TensorFlow 
  • Use your deployed model as a microservice you can access from both batch and real-time pipelines
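
To make the logistic regression step in the list above more concrete, here is a minimal sketch of the underlying technique in plain Python. It is not Spark code and not CiDrep's implementation: the function names (`train_logistic_regression`, `predict_probability`), the learning rate, and the toy data are all assumptions chosen for illustration. A production pipeline would more likely rely on a library such as Spark MLlib.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_probability(weights, bias, features):
    """Return the modeled probability that the label is 1."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


def train_logistic_regression(rows, labels, learning_rate=0.5, epochs=2000):
    """Fit weights and bias with full-batch gradient descent (illustrative only)."""
    n_samples = len(rows)
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in zip(rows, labels):
            # Gradient of the log loss with respect to the linear score.
            error = predict_probability(weights, bias, features) - label
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - learning_rate * g / n_samples
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias


if __name__ == "__main__":
    # Hypothetical toy data: one feature, label 1 for the larger values.
    toy_rows = [[0.0], [1.0], [2.0], [3.0]]
    toy_labels = [0, 0, 1, 1]
    w, b = train_logistic_regression(toy_rows, toy_labels)
    for row in toy_rows:
        print(row, round(predict_probability(w, b, row), 3))
```

Once trained, a model like this can sit behind a small prediction function, which is the same shape of interface a deployed microservice would expose to batch and real-time callers.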

Cloud genomics platforms offer petabyte-scale genomic data processing and analysis for all of your data needs, including:

  • Securely storing, processing, exploring, and sharing large and complex genomic datasets
  • Applying the GATK Best Practices for variant discovery in whole genome sequencing (WGS) data
  • Running an RNA-Seq pipeline that demonstrates Nextflow usage
  • Running a Sentieon DNAseq pipeline
  • Running a dsub pipeline that creates an index (BAI file) from a large binary file of DNA sequences (BAM file)
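
As an illustration of the dsub indexing step mentioned above, the following sketch assembles a dsub command line that runs `samtools index` on a BAM file. The helper name `build_dsub_index_command`, the project ID, the bucket paths, and the container image are hypothetical placeholders. The provider name and flag set follow the pattern of dsub's published samtools example, but they are assumptions here and should be checked against the dsub version you have installed.

```python
import shlex


def build_dsub_index_command(provider, project, region, logging_path,
                             bam_path, bai_path, image):
    """Build (but do not run) a dsub argument list that indexes a BAM file."""
    return [
        "dsub",
        "--provider", provider,
        "--project", project,
        "--regions", region,
        "--logging", logging_path,
        # dsub exposes inputs and outputs to the command as environment variables.
        "--input", f"BAM={bam_path}",
        "--output", f"BAI={bai_path}",
        "--image", image,
        "--command", "samtools index ${BAM} ${BAI}",
        "--wait",
    ]


if __name__ == "__main__":
    # Every value below is a placeholder, not a real project, bucket, or image.
    command = build_dsub_index_command(
        provider="google-cls-v2",  # assumption: depends on your dsub version
        project="example-project",
        region="us-central1",
        logging_path="gs://example-bucket/logging",
        bam_path="gs://example-bucket/sample.bam",
        bai_path="gs://example-bucket/sample.bam.bai",
        image="example-registry/samtools:latest",
    )
    # Print a shell-safe version for review before running it by hand.
    print(shlex.join(command))
```

Building the argument list separately from executing it keeps the sketch safe to run anywhere and makes the command easy to review before any cloud resources are used.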