Bioinformatics – Project Portfolio


Transcriptomics CHO Alignment Pipeline

on July 4, 2024 (updated August 4, 2024)

Summary: Using FastQC, Trimmomatic, HISAT2, samtools, and featureCounts, an alignment pipeline was built to align RNA-seq data to the Chinese hamster ovary (CHO) genome. The annotation files were taken from NCBI. The workflow up to the featureCounts step shows an older annotation file. Note that the workflow is identical for the updated annotation file, but the updated …
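As a rough illustration of how the tools named in the summary usually chain together, here is a minimal sketch that only builds the command lines and does not run them. The step order, the single-end layout, the trimming thresholds, and every file name (`cho_index`, `cho_annotation.gtf`, the `results` directory) are assumptions made for the example, not details taken from the project.

```python
import shlex


def build_pipeline_commands(sample, reads, index_prefix, annotation,
                            out_dir="results", threads=4):
    """Return one argument list per step for a single-end RNA-seq sample.

    All paths and parameter values are hypothetical placeholders.
    """
    trimmed = f"{out_dir}/{sample}.trimmed.fastq.gz"
    sam = f"{out_dir}/{sample}.sam"
    bam = f"{out_dir}/{sample}.sorted.bam"
    counts = f"{out_dir}/{sample}.counts.txt"
    return [
        # 1. Quality report on the raw reads (out_dir must already exist).
        ["fastqc", reads, "-o", out_dir],
        # 2. Quality trimming; the window and length thresholds are assumed values.
        ["trimmomatic", "SE", "-phred33", reads, trimmed,
         "SLIDINGWINDOW:4:20", "MINLEN:36"],
        # 3. Spliced alignment against a prebuilt HISAT2 index of the CHO genome.
        ["hisat2", "-p", str(threads), "-x", index_prefix,
         "-U", trimmed, "-S", sam],
        # 4. Coordinate-sort the alignments, then index the sorted BAM.
        ["samtools", "sort", "-o", bam, sam],
        ["samtools", "index", bam],
        # 5. Count reads per gene using the NCBI annotation file.
        ["featureCounts", "-T", str(threads), "-a", annotation,
         "-o", counts, bam],
    ]


if __name__ == "__main__":
    for cmd in build_pipeline_commands(
        "sample1", "sample1.fastq.gz", "cho_index", "cho_annotation.gtf"
    ):
        print(shlex.join(cmd))
```

Because only the `annotation` argument changes between the older and the updated annotation file, the same function covers both runs, which matches the note that the workflow is otherwise identical.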

Kaggle Leash Bio Therapeutics Competition

on May 7, 2024 (updated November 29, 2024)

Source Code: GitHub. Competition Overview: In this competition, you’ll develop machine learning (ML) models to predict the binding affinity of small molecules to specific protein targets – a critical step in drug development for the pharmaceutical industry that would pave the way for more accurate drug discovery. You’ll help predict which drug-like small molecules (chemicals) …

Brain Region Enrichment Analysis

on April 21, 2024 (updated November 11, 2024)

An exploration of different biological enrichment algorithms and machine learning algorithms applied to an RNA expression dataset. Source Code: GitHub. Motivation: This project is an exploration of RNA-seq data from Kaggle. I initially downloaded this dataset because I wanted to learn how to do data analysis on high-dimensional biological data. …

Molecular Graph Decomposition

on April 20, 2024 (updated August 6, 2024)

*Image credit to Ryu et al., from “Deeply learning molecular structure-property relationships using attention- and gate-augmented graph convolutional network”

Dash and SQLAlchemy Dashboard

on April 19, 2024 (updated November 12, 2024)

Source Code: GitHub. Summary: A stock forecasting Dash dashboard backed by a MySQL database; the database was set up both in AWS RDS and on a local MySQL server. The final project uses the local database simply to avoid unnecessary costs. Below is a high-level view of the programming layout. App.py has the main Plotly …

CAFA 5 Protein Function Prediction (Kaggle Competition)

on August 25, 2023 (updated April 8, 2024)

Competition Description: The goal of this competition is to predict the function of a set of proteins. You will develop a model trained on the amino-acid sequences of the proteins and on other data. Your work will help researchers better understand the function of proteins, which is important for discovering how cells, tissues, and organs …