My Portfolio
Help Me README
This tool automatically generates README.md files for GitHub repositories using Google AI.
Spark Configuration Calculator
The Spark Configuration Tool is a Streamlit-based application designed to assist users in optimizing Apache Spark configurations. It allows users to input various parameters related to cluster, node, and executor configurations, providing recommended Spark configurations based on those inputs.
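As a rough illustration of the kind of calculation involved, here is a minimal sketch based on a commonly cited sizing heuristic: reserve one core and 1 GB per node for the operating system, aim for about five cores per executor, leave one executor slot for the application master, and set aside about 10% of each executor's memory share for overhead. The function name, parameters, and rules are assumptions for illustration, not necessarily what the tool itself implements.

```python
def recommend_spark_config(nodes, cores_per_node, memory_gb_per_node,
                           cores_per_executor=5):
    """Return a hypothetical set of Spark executor settings for a cluster."""
    if nodes < 1 or cores_per_node < 2 or memory_gb_per_node < 2:
        raise ValueError("cluster is too small for this heuristic")

    # Reserve one core and 1 GB per node for the OS and node daemons.
    usable_cores = cores_per_node - 1
    usable_memory_gb = memory_gb_per_node - 1

    # Never ask for more cores per executor than a node can offer.
    cores_per_executor = min(cores_per_executor, usable_cores)
    executors_per_node = usable_cores // cores_per_executor

    # Leave one executor slot free for the application master.
    total_executors = max(1, executors_per_node * nodes - 1)

    # Split node memory across executors, then hold back roughly 10%
    # (at least 384 MB) of each share as an approximation of memory overhead.
    share_gb = usable_memory_gb / executors_per_node
    overhead_gb = max(0.384, 0.10 * share_gb)
    heap_gb = max(1, int(share_gb - overhead_gb))

    return {
        "spark.executor.instances": total_executors,
        "spark.executor.cores": cores_per_executor,
        "spark.executor.memory": f"{heap_gb}g",
    }


if __name__ == "__main__":
    print(recommend_spark_config(nodes=10, cores_per_node=16, memory_gb_per_node=64))
```

The resulting values are only a starting point; real workloads usually need tuning against observed memory and shuffle behavior.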
Memorystore for Clusters - Export Utility
A utility to export data from Memorystore for Redis clusters to JSON. It is a wrapper around RIOT that adds logging, error handling, and a Python interface that can be used with a Cloud Function.
Ask-Me-Anything using VertexAI + LangChain + Streamlit
The app lets users specify the sitemap of a website, crawls the entire site based on the URLs in the sitemap, and uses the information as a knowledge base to answer user queries. We’ll use VertexAI embeddings and language models to understand user queries and provide relevant and accurate responses.
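The first step, collecting page URLs from a sitemap, can be sketched with the standard library alone. This is a minimal sketch that assumes a plain sitemap in the standard sitemaps.org format (not a sitemap index); the function name is hypothetical, and the crawling, embedding, and question-answering steps are not shown.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(sitemap_xml):
    """Return the page URLs listed in a sitemap XML document."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text and loc.text.strip()
    ]


if __name__ == "__main__":
    example = """<?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url><loc>https://example.com/</loc></url>
      <url><loc>https://example.com/docs</loc></url>
    </urlset>"""
    for url in urls_from_sitemap(example.strip()):
        print(url)
```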
Quiz Quotient Web App
When I was 12, I compiled and published an MCQ quiz book with 1000 questions. I decided to host my dataset of 1000 questions on the cloud, expose it via an API endpoint, and develop a small PHP page to publish it.
Migrating from Tableau to Amazon QuickSight
This guide documents the high-level process of migrating dashboards from Tableau to Amazon QuickSight.
Real Time Stream Generator
The script reads a file of sample JSON records and creates log files. Kinesis can then be configured to monitor these log files.
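The general idea can be sketched as follows. This is a minimal sketch, assuming the sample file holds one JSON object per line; the function name, file naming scheme, and parameters are hypothetical and do not come from the original script.

```python
import itertools
import json
import time
from pathlib import Path


def generate_log_files(sample_path, output_dir, num_files=3,
                       records_per_file=5, delay_seconds=0.0):
    """Write log files by cycling through the sample JSON records."""
    lines = Path(sample_path).read_text().splitlines()
    samples = [json.loads(line) for line in lines if line.strip()]
    if not samples:
        raise ValueError("sample file contains no JSON records")

    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    # Repeat the samples endlessly so any number of records can be emitted.
    record_stream = itertools.cycle(samples)
    written = []
    for index in range(num_files):
        log_path = output_dir / f"stream-{index:04d}.log"
        with log_path.open("w") as log_file:
            for _ in range(records_per_file):
                log_file.write(json.dumps(next(record_stream)) + "\n")
                log_file.flush()  # make each record visible to a file watcher
                time.sleep(delay_seconds)  # pace the simulated stream
        written.append(log_path)
    return written
```

Setting `delay_seconds` to a small positive value spaces the records out so a monitoring agent sees them arrive gradually rather than all at once.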
SSIS Migration Accelerator - SQL Extractor
This utility extracts SQL statements from an SSIS package and creates a config file in key-value format, where the key is the name of the SSIS stage containing the SQL and the value is the SQL query.
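A minimal sketch of this kind of extraction is shown below. It assumes a `.dtsx` package in the XML layout used by recent SSIS versions, where each Execute SQL Task stores its statement in a `SQLTask:SqlStatementSource` attribute, and it handles only that task type. The function names and the `key=value` output layout are hypothetical, not the utility's actual interface.

```python
import xml.etree.ElementTree as ET

# XML namespaces used inside .dtsx packages (assumed package layout).
DTS_NS = "www.microsoft.com/SqlServer/Dts"
SQLTASK_NS = "www.microsoft.com/sqlserver/dts/tasks/sqltask"


def extract_sql(dtsx_xml):
    """Map each Execute SQL Task name to its SQL statement."""
    root = ET.fromstring(dtsx_xml)
    statements = {}
    for executable in root.iter(f"{{{DTS_NS}}}Executable"):
        name = executable.get(f"{{{DTS_NS}}}ObjectName")
        task = executable.find(
            f"{{{DTS_NS}}}ObjectData/{{{SQLTASK_NS}}}SqlTaskData"
        )
        if name is None or task is None:
            continue  # not an Execute SQL Task
        sql = task.get(f"{{{SQLTASK_NS}}}SqlStatementSource")
        if sql:
            # Collapse whitespace so each statement fits on one config line.
            statements[name] = " ".join(sql.split())
    return statements


def to_config(statements):
    """Render the statements as key=value lines."""
    return "\n".join(f"{name}={sql}" for name, sql in statements.items())
```

Collapsing whitespace keeps every entry on a single line, which makes the config easy to parse but discards the original formatting of the query.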
Handle Changing Reference Data In A Glue Streaming Job With High Availability
A solution architecture that addresses the problem of changing reference data in a Glue Streaming job.
DataStage Bash Wrapper Script
A DataStage wrapper script written in bash. This script can be used to invoke DataStage sequencers and can be scheduled through third-party schedulers. It performs a variety of checks and validations before invoking the job.
ETL Alert Framework
A framework that provides a standardized feedback mechanism for ETL processes while keeping the developer free from the implementation details of the alert system.
Amazon OpenSearch on AWS Data Wrangler
An open-source Python initiative that extends the power of the pandas library to AWS, connecting DataFrames and AWS data-related services.
I collaborated with an AWS ProServe team to build OpenSearch functionality into the AWS Data Wrangler library.
Automated Oracle Audit Trigger Creation Script
These scripts provide a way to automatically create audit triggers on Oracle tables that are frequently manipulated by users.