Utkarsh Rai

Data & Infrastructure Engineer | Healthcare Systems

Healthcare Data • Cloud Architecture • ETL/ELT • High-Performance Computing

Salt Lake City, UT

(+1) 501-410-5118

urai@merkalis.io

Data and Infrastructure Engineer with 6+ years of experience designing scalable architectures and managing high-performance computing workflows. Proven expertise in building reliable data pipelines for complex, HIPAA-compliant medical datasets and healthcare informatics systems. Adept at translating healthcare and research requirements into robust technical solutions using cloud infrastructure (AWS, Azure), Python automation, and data engineering principles. Currently pursuing a Ph.D. in Biomedical Informatics, with published research contributions.

Experience

Founding Engineer

Merkalis Inc.
  • Designed scalable data ingestion pipelines and backend architecture for massive, IRB-governed clinical datasets, working closely with healthcare engineering leaders
  • Led development of a resilient storage system using Merkle trees to ensure verifiable data integrity, security, and version control across environments
  • Collaborated with cross-functional teams to integrate AI models into clinical workflows, translating complex technical requirements into reliable, domain-focused data services
Dec 2025 - Present

Assistant to Director, High Performance Computing

University of Arkansas System
  • Developed and orchestrated automated data processing workflows to support medical imaging research and large-scale data extraction in distributed cluster environments
  • Partnered directly with clinical researchers and non-technical stakeholders to gather requirements, establish governance standards, and build a standardized Canonical Data Model (CDM) for downstream analytics
  • Implemented Git-based version control and CI/CD pipelines using GitHub Actions to automate testing and ensure reproducible deployments across architectures
  • Participated in routine code reviews and maintained documentation to ensure reliability and long-term maintainability of data flows
Aug 2022 - Dec 2025

Data Engineer

C2FO
  • Built and optimized robust ETL pipelines using Python and Spark to extract, transform, and load financial data into the enterprise data warehouse
  • Designed dimensional data models and optimized complex SQL queries to feed reporting dashboards in Power BI and Tableau, supporting business intelligence and ad-hoc analysis
Jun 2021 - Aug 2022

Data Engineer

Reliance Jio
  • Engineered highly scalable ELT pipelines to ingest, process, and consolidate terabytes of high-velocity structured and unstructured telecommunications data
Aug 2020 - Jun 2021

ML Research Engineer

WorldQuant
  • Developed predictive models and integrated data processing scripts for retail and financial datasets, improving forecasting accuracy and analytics workflows
Jul 2018 - Aug 2020

Education

University of Arkansas for Medical Sciences

Ph.D. in Biomedical Informatics

In Progress • Focus: Healthcare Data Governance & Medical Image Management

2023 - Present

University of Arkansas at Little Rock

Graduate Certificate in Data Science
2022 - 2023

Birla Institute of Technology and Science, Pilani

B.E. in Computer Science
2016 - 2020

Skills

Programming Languages & Tools
Key Strengths

Publications & Presentations

Conference Presentations

  • Oral Presentation (Selected Talk): Rai, U., & Tarbox, L. (2024). High Throughput JPEG2000 as a default storage format for medical images. 32nd International Conference on Intelligent Systems for Molecular Biology, Montreal, Canada.
  • Rai, U., & Koru, G. (2025). Enhancing Data Governance in Healthcare: A Composite Dashboard Approach for Maine's All Payer Claims Database. 33rd International Conference on Intelligent Systems for Molecular Biology, Liverpool, UK.
  • Rai, U., & Tarbox, L. (2024). Lossless High Throughput JPEG2000 as a default storage format for medical images. 16th Great Lakes Bioinformatics (GLBIO) Conference, Pittsburgh, PA.
  • Rai, U., & Tarbox, L. (2024). Lossless High Throughput JPEG2000 as a default storage format for medical images. Arkansas Bioinformatics Consortium (AR-BIC), North Little Rock, AR.
  • Rai, U., & Tarbox, L. (2024). High Throughput JPEG2000 as a default storage format for medical images. UAMS Student Research Day, Little Rock, AR.
  • Poster Presenter, Data Analytics that are Robust and Trusted (DART), Little Rock, AR (2024)
  • Talburt, J., & Rai, U. (2023). A novel method to evaluate and optimize unsupervised clustering tasks. DART YR 3 All Hands Meeting and Student Poster Competition, Springdale, AR.
  • Presenter, Tech4Good Summit, Bangalore, India (2019)

Preprints

Honors, Awards & Funding

Professional Activities & Teaching

Professional Memberships

  • Member, Engineering for Change (2018-Present) - Global network of engineers dedicated to solving humanitarian challenges
  • Member, Tech4Good (2018-Present) - Technology initiatives for social impact

Teaching Experience

  • Python Instructor, Centre for Technical Education, BITS Pilani Goa, India (2018-2019)

Community Leadership

  • Project Lead, Nirmaan (2016-2020) - Education initiative supporting underprivileged students in India