Atlanta, Ga | 678.668.3294 | michaelrizig@gmail.com | https://michaelrzg.github.io/
As an MS CS student at KSU, I am a software engineer with a fresh background in computer science and a passion for growth. I am highly motivated and results-driven, well-versed in the concepts and practices of programming, and proficient in a variety of programming languages.
Since my middle school days, I have always had a passion for programming. My first game project was a 2D game written in Java when I was 14; I called it Epic Adventure (very creative). While I've since moved on from game development, it was my fondness for video games that started my career as a programmer.
MS IN COMPUTER SCIENCE | DEC 2025 | KENNESAW STATE UNIVERSITY, ATLANTA GA (DUAL DEGREE PROGRAM)
BS IN COMPUTER SCIENCE | MAY 2025 | KENNESAW STATE UNIVERSITY, ATLANTA GA (3.81 GPA)
CONCENTRATION IN ARTIFICIAL INTELLIGENCE
STUDENT / FREELANCE PROJECTS | SEPTEMBER 2021 – PRESENT
· Recent Project: Developed a real-time data processing application for Gwinnett County Public Schools, integrating the Samsara Kafka Connector to consume Asset Location and Asset Speed events. Implemented data extraction, format checks, and storage in a SQL Server database, with separate tables for valid and rejected events. Utilized Docker for containerized deployment, ensuring consistent and scalable solutions. Enhanced system efficiency with real-time monitoring and error handling, transitioning from API polling to Kafka-based streaming for near real-time visibility into school bus operations. Gained hands-on experience with Python/C# and full-stack development, contributing significantly to operational efficiency.

IT INTERN | DELTA COMMUNITY CREDIT UNION | MARCH 2020 – AUGUST 2021
· As an IT intern at DCCU, I had a hand in creating and managing large-scale IT deployments, managing system monitoring and health metrics, and developing in-house fraud detection software.
· Helped implement database security and Red Hat Linux administration and management. Had a hands-on role in data compilation, using data engineering techniques to understand trends.

APPRENTICE TELLER | DELTA COMMUNITY CREDIT UNION | MAY 2019 – MARCH 2020
· Focus on customer service and sales. As a teller I handled large quantities of physical and digital currency, managed accounts, made wire transfers, and helped open and close member accounts.
· Provided hands-on support for day-to-day processes at Delta Community Credit Union.
Developed a real-time data processing application for Gwinnett County Public Schools, integrating the Samsara Kafka Connector to consume over 2000 simultaneous Asset Location and Asset Speed events. Implemented data extraction, format checks, and storage in a SQL Server database, with separate tables for valid and rejected events. Utilized Docker for containerized deployment, ensuring consistent and scalable solutions. Enhanced system efficiency with real-time monitoring and error handling, transitioning from API polling to Kafka-based streaming for near real-time visibility into school bus operations. Gained hands-on experience with Python and full-stack development, contributing significantly to operational efficiency.
General Design and Dataflow:
This real-time server application has two main processes: Event Streaming and Data Handling.
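To make the event-streaming half concrete, here is a minimal sketch of a consumer loop, assuming a confluent-kafka consumer and pyodbc for the SQL Server connection. The topic names, table names, and validation rule are hypothetical stand-ins, not the production configuration:

import json
from confluent_kafka import Consumer
import pyodbc

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",   # assumed broker address
    "group.id": "asset-events",              # hypothetical consumer group
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["asset-location", "asset-speed"])  # hypothetical topics

db = pyodbc.connect("DSN=AssetEvents")  # assumed SQL Server DSN
cursor = db.cursor()

def is_valid(event: dict) -> bool:
    # format check: require the fields the pipeline depends on (illustrative)
    return all(k in event for k in ("assetId", "timestamp", "value"))

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    event = json.loads(msg.value())
    # route valid and rejected events to separate tables
    table = "ValidEvents" if is_valid(event) else "RejectedEvents"
    cursor.execute(f"INSERT INTO {table} (payload) VALUES (?)", json.dumps(event))
    db.commit()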
Logistic Regression Classifier Algorithm in Python used to predict the sentiment of reviews (either negative or positive). Classification, training, and testing use a real Amazon product review dataset.

Step by Step Guide:

Step 1: The first step is data preprocessing. We start by removing stop words from our data. The preprocess() function reads our raw CSV data and outputs 2 new files:
- test_formatted.csv : test_amazon.csv with all stopwords removed
- train_formatted.csv : train_amazon.csv with all stopwords removed

Notes:
- This function will only run if test_formatted.csv and train_formatted.csv do not exist, i.e. only the first time you run main.py.
- You can run this portion of the program manually by running the preprocess_dataset.py script.

Step 2: The second step is to extract features from our now preprocessed data. The extract_features() function takes in both test_formatted.csv and train_formatted.csv and outputs 2 new files:
- testing_features.csv : holds all features as well as class labels from the testing dataset
- training_features.csv : holds all features as well as class labels from the training dataset

Notes:
- A 'feature' set is described as X = [x1, x2, x3, x4, x5, c] where:
  x1 = # of positive lexicons
  x2 = # of negative lexicons
  x3 = whether "no" ∈ sample (either 0 or 1)
  x4 = whether '!' ∈ sample (either 0 or 1)
  x5 = log(word count)
  c = class (1 = positive class, 0 = negative class)
- If testing_features.csv and training_features.csv already exist, this function will not run.

Step 3: Next we load features from our feature files. The load_features() function runs every time the model is run, loading our features from training_features.csv and testing_features.csv into memory. Below is an example of what a feature looks like:
(Positive word count, Negative word count, "no" count, "!" count, log(word count), class)
- 2, 0, 0, 1, 1.1139433523068367, 1

Step 4: It's time for our model to learn the features. First, we 'fit' the model to our features, which is our training step. For our feature set X and label set y, we call logistical_regression.fit(X, y):
X = m x n matrix, where m = # of data samples and n = # of features
y = 1 x m matrix of correct labels for X

fit() works by initializing the weights to a (1 x n) matrix of zeros and the bias to 0. For n iterations:
- We take the dot product of our sample and weight matrices, then add our bias (similar to linear regression). By applying sigmoid to this output, we convert it to logistic regression and get an output value between 0 and 1.
- We take this output, determine the error (loss), and take its derivative to determine our grad_weight and grad_bias.
- We then multiply these values by our learning rate and subtract them from our weights and bias to determine the new weights for the next iteration.
- By the end of n iterations, we have 'optimized' our weights and bias by slowly approaching a local minimum of our loss function.

Step 5: The final step is to run and generate our accuracy and confusion matrix. The logistical_regression.run() function takes in a sample and its expected output, and runs predict on that sample. It then calculates the accuracy as correct/total and computes the confusion matrix, storing our true positive (TP), true negative (TN), false positive (FP), and false negative (FN) values.
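A minimal sketch of the fit and predict logic described in Step 4 (simplified and illustrative, not the repository's exact code):

import numpy as np

def sigmoid(z):
    # maps any real value to (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def fit(X, y, learning_rate=0.01, n_iterations=1000):
    # X: (m x n) feature matrix, y: length-m vector of 0/1 labels
    m, n = X.shape
    weights = np.zeros(n)   # weights start as zeros
    bias = 0.0              # bias starts at 0
    for _ in range(n_iterations):
        # linear combination plus bias, squashed by sigmoid
        y_hat = sigmoid(np.dot(X, weights) + bias)
        # gradients of the loss with respect to weights and bias
        grad_weights = np.dot(X.T, (y_hat - y)) / m
        grad_bias = np.sum(y_hat - y) / m
        # step toward a local minimum of the loss
        weights -= learning_rate * grad_weights
        bias -= learning_rate * grad_bias
    return weights, bias

def predict(X, weights, bias):
    # threshold the sigmoid output at 0.5 to pick a class
    return (sigmoid(np.dot(X, weights) + bias) >= 0.5).astype(int)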
This project is a series of subprojects for the MSCS course Advanced Machine Learning. These machine learning projects include unsupervised learning via the K-means algorithm, which classifies unlabeled data and sorts it into clusters based on similarity, and supervised learning via KNN classification of validation data. The course covers and utilizes machine learning techniques for clustering, data classification, supervised and unsupervised learning, deep neural nets, and more.
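To give a flavor of the unsupervised half, a bare-bones K-means loop in numpy might look like this sketch (illustrative, not the course implementation):

import numpy as np

def kmeans(X, k, n_iterations=100, seed=0):
    # cluster the (m x n) data X into k groups by similarity
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iterations):
        # assign each point to its nearest centroid
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = np.argmin(distances, axis=1)
        # move each centroid to the mean of its assigned points
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids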
Machine vision projects including image resolution manipulation, image enhancement, smoothing and sharpening filters, morphological operations, and deep learning for segmentation and classification of images.
This neural network was built from scratch using no prebuilt frameworks or libraries (except math via numpy). It classifies spiral datasets with 99% accuracy. It utilizes a categorical cross-entropy loss function, stochastic gradient descent optimization, and softmax and ReLU activation functions.
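To give a sense of the pieces involved, here is a hedged numpy sketch of the ReLU and softmax activations and the categorical cross-entropy loss (illustrative names, not the project's actual classes):

import numpy as np

def relu(z):
    # zeroes out negative activations
    return np.maximum(0, z)

def softmax(z):
    # subtract the row max for numerical stability, then normalize
    exp = np.exp(z - z.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_cross_entropy(probs, labels):
    # probs: (m x classes) softmax outputs, labels: length-m integer classes
    clipped = np.clip(probs[np.arange(len(labels)), labels], 1e-7, 1.0)
    return -np.mean(np.log(clipped))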
Handwritten digit recognition via statistical supervised learning, using a Naive Bayes classifier written in C++. Utilizes the MNIST binary dataset of 10000 handwritten digits for training and testing.
A client-server scheme between pairs of clients and servers, with a router to direct traffic and build a routing table. Includes research papers testing the impact of different thread management and work parallelization strategies on server tasks. The model supports transmitting audio and video files.
This Long Short-Term Memory model was built from scratch using no prebuilt frameworks or libraries (except math via numpy). Long short-term memory (LSTM) is a type of recurrent neural network (RNN) that can process and retain information over multiple time steps. LSTMs are used in deep learning and artificial intelligence to learn, process, and classify sequential data, such as text, speech, and time series. LSTMs are designed to prevent the network's gradients from decaying or exploding as they cycle through feedback loops; the decay is known as the vanishing gradient problem, which traditional RNNs face. LSTMs use gates to capture both short-term and long-term memory and to regulate the flow of information into and out of the cell. The three gates are the input gate, the output gate, and the forget gate.
Below is a high level walkthrough of how the code works. For more details, please refer to the documentation in the code.
The first step in an LSTM model is initializing our short-term (h) and long-term (C) memory states, which in our model both start as np.zeros arrays. The states are zero-valued on the first iteration, but on each recurring pass we use the previous iteration's output short-term and long-term states. We also initialize our gradient weights to small random values.
# set up lists to store short term, long term, and c_tilde (last gate) states at timestamps 0 to t
self.short_term_memory = [np.zeros((self.n_neurons, 1)) for x in range(self.max_vector + 1)]
self.long_term_memory = [np.zeros((self.n_neurons, 1)) for x in range(self.max_vector + 1)]
self.update_gate = [np.zeros((self.n_neurons, 1)) for x in range(self.max_vector)]

# forget gate values
self.dUf = 0.1 * np.random.rand(self.n_neurons, 1)
self.dbf = 0.1 * np.random.rand(self.n_neurons, 1)
self.dwf = 0.1 * np.random.rand(self.n_neurons, self.n_neurons)

# input gate values
self.dUi = 0.1 * np.random.rand(self.n_neurons, 1)
self.dbi = 0.1 * np.random.rand(self.n_neurons, 1)
self.dwi = 0.1 * np.random.rand(self.n_neurons, self.n_neurons)

# output gate values
self.dUo = 0.1 * np.random.rand(self.n_neurons, 1)
self.dbo = 0.1 * np.random.rand(self.n_neurons, 1)
self.dwo = 0.1 * np.random.rand(self.n_neurons, self.n_neurons)

# update gate values
self.dUg = 0.1 * np.random.rand(self.n_neurons, 1)
self.dbg = 0.1 * np.random.rand(self.n_neurons, 1)
self.dwg = 0.1 * np.random.rand(self.n_neurons, self.n_neurons)
Once we have our values initialized, we can begin our first pass. We start by calculating the 'forget gate', which determines what ratio of our previous long-term memory we are going to carry into this pass.
To determine the forget gate's output, we take the dot product of our input with the input weights, add the dot product of our short-term memory at [t] with the short-term weights, and add our bias:
# forget gate (determine how much of long term memory we will remember or 'forget')
# output = input * input weight + short term * short term weight + forget gate bias
outputf = np.dot(self.Uf, x) + np.dot(self.wf, ht) + self.bf
We then utilize sigmoid as an activation function to map the output of the forget gate to a value between 0 and 1, representing the ratio of our long term memory we are keeping. We save this ratio in our ft variable.
# use our predefined array of sigmoid objects
Sigmf[t].forward(outputf)
# grab output for later
ft = Sigmf[t].output
This value (ft) will be multiplied with our long-term memory (C) in the next step.
The next step has two substeps: (1) calculating what percentage of our new value to add or 'remember' into our long-term memory, and (2) determining the actual value to add.
We start by calculating the same linear combination as in our forget gate, but with our input gate's weights:
# repeat for input gate
outputi = np.dot(self.Ui, x) + np.dot(self.wi, ht) + self.bi
Sigmi[t].forward(outputi)
it = Sigmi[t].output
This output value from our sigmoid (it) covers the first substep (what percentage of our new value to add or 'remember' into our long-term memory). The second substep requires a different activation function, tanh, which returns a value between -1 and 1.
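The per-timestep activation objects used throughout this walkthrough (Sigmf, Sigmi, tan_1, and so on) are assumed to be small wrapper classes roughly like this sketch; their exact implementation lives elsewhere in the repo:

import numpy as np

class Sigmoid:
    def forward(self, inputs):
        # squash to (0, 1) and cache the output for the backward pass
        self.output = 1.0 / (1.0 + np.exp(-inputs))

    def backwards(self, dvalues):
        # derivative of sigmoid: s * (1 - s)
        self.dinputs = dvalues * self.output * (1.0 - self.output)

class Tanh:
    def forward(self, inputs):
        # squash to (-1, 1) and cache the output for the backward pass
        self.output = np.tanh(inputs)

    def backwards(self, dvalues):
        # derivative of tanh: 1 - tanh^2
        self.dinputs = dvalues * (1.0 - self.output ** 2)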
We use the third gate, the c_tilde (candidate) gate, and multiply its value with our input gate's value to determine what value to add to our long-term memory:
# finally repeat for c-tilde
outc_tilde = np.dot(self.Ug, x) + np.dot(self.wg, ht) + self.bg
tan_1[t].forward(outc_tilde)
c_tilde = tan_1[t].output
Now we finally combine our values: we multiply our forget gate value by our long-term memory, then add the product of our input gate and c_tilde gate values:
ct = np.multiply(ft,ct) + np.multiply(it,c_tilde)
Now our long-term memory is updated.
Finally, we update the short-term memory to end this iteration and pass its values to the next. We update the short-term memory by multiplying the output gate's result by the tanh of the long-term memory.
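The output gate computation is not shown in the snippets above; it is assumed to follow the same pattern as the other gates. This sketch mirrors that pattern, with Sigmo standing in for the output gate's sigmoid array:

# output gate: same linear combination as the other gates, squashed by sigmoid
# (assumed lines, mirroring the forget/input gate pattern above)
outputo = np.dot(self.Uo, x) + np.dot(self.wo, ht) + self.bo
Sigmo[t].forward(outputo)
ot = Sigmo[t].output
# tanh of the freshly updated long term memory
tan_2[t].forward(ct)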
ht = np.multiply(tan_2[t].output,ot)
At this point, all that's left for our forward pass is to update our logs.
# update our short (h), long (c), and ctilde logs
H[t+1] = ht
C[t+1] = ct
C_tilde[t] = c_tilde
# update logs of inner values
F[t] = ft
O[t] = ot
I[t] = it
Since we need to make a decision, we need a pair of simple dense layers to give us a decision value. These are plain dense layers with no bells or whistles, so I'll spare you the redundant explanation. Here's what the forward pass of the dense layers looks like:
# forward propagation
def forward(self, inputs):
    self.output = np.dot(inputs, self.weights) + self.bias
    self.inputs = inputs
Here comes the tricky part. Since there are dozens of 'learnable values' and several sets of weights, backpropagation can be scary. A very simple explanation: we collect the gradients of the loss for the different gates, apply the chain rule since they are nested, then multiply the gradients by our learning rate and subtract the result from each weight set. Here's what the implementation looks like:
def backward_pass(self, dvalues):
    # grab our max vector size, short term, and long term memory
    T = self.max_vector
    H = self.short_term_memory
    C = self.long_term_memory
    # our saved inner gate values
    O = self.output_gate
    I = self.input_gate
    C_Tilde = self.update_gate
    # data
    data = self.data
    # activation functions
    sigf = self.Sigf
    sigi = self.Sigi
    sigo = self.Sigo
    tan1 = self.Tan1
    tan2 = self.Tan2
    # BPTT
    dht = dvalues[-1, :].reshape(self.n_neurons, 1)
    for t in reversed(range(T)):  # working backwards
        # get data in form we can work with
        xt = data[t].reshape(1, 1)
        # backwards prop to get dtan2
        tan2[t].backwards(dht)
        # store output of backwards call
        dtanh2 = tan2[t].dinputs
        # dht with respect to tanh
        dhtdtanh = np.multiply(O[t], dtanh2)
        # dct with respect to dft
        dctdft = np.multiply(dhtdtanh, C[t-1])
        # dct with respect to dit
        dctdit = np.multiply(dhtdtanh, C_Tilde[t])
        # dct with respect to dct_tilde
        dctdct_tilde = np.multiply(dhtdtanh, I[t])
        # backwards prop to get dtan1
        tan1[t].backwards(dctdct_tilde)
        # store output of backwards call
        dtanh1 = tan1[t].dinputs
        ...
        # finally, we find derivative of short term memory
        dht = (np.dot(self.wf, dsigmf) + np.dot(self.wi, dsigmi)
               + np.dot(self.wo, dsigmo) + np.dot(self.wg, dtanh1)
               + dvalues[t-1, :].reshape(self.n_neurons, 1))
    self.short_term_memory = H
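The per-gate weight updates (elided in the snippet above) would follow the standard gradient descent pattern. A hedged sketch for one gate, assuming the accumulated gradients dwf, dUf, dbf from initialization and a learning_rate attribute (the attribute name is an assumption):

# hypothetical update step for the forget gate's weight set;
# the other gates would follow the same pattern
self.wf -= self.learning_rate * self.dwf
self.Uf -= self.learning_rate * self.dUf
self.bf -= self.learning_rate * self.dbf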