Bartosz Bryg

CS Researcher  ·  HPC  ·  AI  ·  Systems

Computer Scientist

New Jersey Institute of Technology
Albert Dorman Honors College

  • HPC & Scalable Systems
  • AI & Graph Algorithms
  • Honors Student · 4.0 GPA
  • NCAA Division I Athlete
  • Drones & Robotics

About Me

I'm Bartosz Bryg, a third-year Computer Science student at NJIT, minoring in Drones and Robotics while maintaining a 4.0 GPA. I'm part of the Albert Dorman Honors College and compete as an NCAA Division I swimmer — experiences that have strengthened my discipline, resilience, and ability to perform under pressure.

At NJIT's Institute for Data Science, I research high-performance graph algorithms for large-scale network analysis with PhD researchers and faculty. My work focuses on distributed graph analytics, scalable community detection, and AI-driven data analysis — contributing to collaborations with Harvard University and UIUC. I was recently recognized with the Grace Hopper AI Research Award for my work in cluster analysis.

My interests span high-performance computing, scalable algorithms, AI, distributed systems, and backend engineering. I enjoy building systems end-to-end, but I'm especially drawn to the large-scale infrastructure users never directly see — the kind that quietly powers everything behind the scenes.

Whether it's distributed graph frameworks, blockchain fraud detection pipelines, RESTful APIs, or ML systems, I'm drawn to solving hard engineering problems with elegant and scalable solutions — particularly at the intersection of performance, architecture, and practical impact.

Along the way, I co-authored a paper accepted at IEEE HPEC 2025 and had a talk accepted at ChapelCon 2025 while still in my junior year. For me, the most exciting part of engineering is turning complex ideas into systems that are fast, reliable, and built to scale.


Research Focus

High Performance Computing
Parallel Graph Algorithms
Large-Scale Graph Analytics
Distributed Programming
Algorithm Design to Solve Real-World Dataset Problems
Machine Learning & AI
Network Community Detection

Technical Skills

Low-level & OS programming · Shared & distributed-memory concurrency · HPC algorithms · AI & ML modeling · API & database engineering

Languages & Development

Python · C++ · Java · C · Chapel · JavaScript · Node.js · Bash / Shell · SQL · Git

HPC & Distributed Systems

High-Performance Computing · Graph Algorithms · Distributed Systems · Parallel Programming · MPI / OpenMPI · OpenMP · SLURM · Linux / Unix · Docker · Kubernetes

ML & AI Engineering

Machine Learning · PyTorch · Graph Neural Networks · TensorFlow · Scikit-learn · XGBoost · Computer Vision · Anomaly Detection · NumPy · Pandas

Backend & Infrastructure

REST APIs · AWS · Algorithm Design · Databases & SQL · Statistical Modeling · Linear Algebra · Optimization · Research Publications · Arkouda · Performance Benchmarking

Experience

Undergraduate Researcher | AI/ML & HPC @ NJIT

September 2023 – Present · Institute for Data Science

Projects

Contributor to the Arkouda-NJIT repository

🌐 Well-Connected Communities (WCC) & Connectivity Modifier — UIUC Collaboration
  • Integrated Arachne with external C++ programs for Leiden community detection and VieCut minimum-cut computation, enabling seamless Chapel–C++ interoperability at scale.
  • Scaled community detection to 2B+ edge graphs, achieving <10 min runtime on 128 cores — enabling graph analysis at a scale previously impractical in the framework.
  • Designed and implemented the distributed-memory parallel executor for multi-node HPC execution, managing data partitioning and cross-node communication.

📄 Research Papers

HiPerMotif: Parallel Subgraph Isomorphism in Property Graphs
  • Co-authored a published research paper (IEEE HPEC '25) introducing HiPerMotif, a hybrid-parallel subgraph isomorphism algorithm implemented in Arachne.
  • Validated on the H01 human cortex connectome — a 147M-edge neuroscience graph — achieving 3–10× runtime and memory improvements over VF3P, Glasgow, and SLF.
  • Benchmarked on synthetic graphs (Erdős–Rényi, scale-free, small-world) and real-world datasets including Flywire, Hemibrain, Amazon, Twitter, and web-Stanford.
  • Currently extending to subgraph enumeration and motif discovery research within the Arachne framework.
🔍 Cluster Analysis Metrics for Network Pattern Detection (NSF Award #2109988)
  • Developed 100+ scalable clustering algorithms in Arachne for pattern discovery across neural, transportation, and social networks.
  • Implemented BFS, Leiden, Louvain, IKC, and Brandes' betweenness centrality in C, C++, and Chapel, achieving a ~400% performance improvement through parallelism and a custom C/C++ backend.
  • Supported Harvard's neural network research, enabling scalable graph analysis for brain connectome discovery.
  • Collaborated with Ph.D. researchers at NJIT and UIUC on metric-driven community detection across large-scale networks.
Aggregation Framework Prototype in Chapel — HPE Collaboration
  • Developed a distributed aggregation framework prototype in Chapel in collaboration with Hewlett Packard Enterprise (HPE), targeting high-throughput data movement in HPC environments.
  • Adapted the framework for the BFS kernel in Arachne, implementing 64-bit bitmap visitation arrays and replacing atomic byte arrays to reduce memory contention.
  • Low-level optimizations achieved a 76× speedup over the baseline — one of the largest single-PR performance gains in the Arachne codebase.
  • Accepted as a talk at ChapelCon 2025, presenting results to the Chapel language and HPC community.
⚙️ Distributed BFS Optimization & Graph Construction in Chapel
  • Enhanced BFS frontier expansion via custom aggregators, reorganizing visitation logic inspired by Graph500 to reduce runtime and improve multi-node scaling.
  • Developed a 64-bit bitmap visitation array replacing atomic byte arrays, dramatically reducing memory contention under high-thread parallelism.
  • Contributed to graph construction routines, improving edge ingestion throughput for large distributed graphs.

Software Engineering Intern · ML & Data Systems

NJIT Institute for Data Science · under Prof. David Bader

May 2024 – August 2024
🔗 Blockchain Graph Analytics — Fraud Detection at Scale
  • Applied graph analytics to blockchain transaction datasets to identify anomalies and potential fraud, tracing circular transaction patterns and suspicious wallet connections.
  • Implemented scalable fraud detection pipelines using advanced graph traversal algorithms, improving detection accuracy by 40% and reducing processing time by 50%.
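
Tracing a circular transaction pattern amounts to finding a cycle in a directed graph of transfers. A minimal sketch, assuming transfers arrive as `(sender, receiver)` pairs; the function name `find_cycle` and the wallet labels are illustrative and not taken from the actual pipeline.

```python
def find_cycle(transfers):
    """Return one cycle of wallets as a list, or None if no cycle exists."""
    graph = {}
    for sender, receiver in transfers:
        graph.setdefault(sender, []).append(receiver)
        graph.setdefault(receiver, [])

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {wallet: WHITE for wallet in graph}
    path = []

    def visit(wallet):
        color[wallet] = GREY
        path.append(wallet)
        for nxt in graph[wallet]:
            if color[nxt] == GREY:
                # Back edge: nxt is already on the current DFS path.
                return path[path.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        color[wallet] = BLACK
        return None

    for wallet in graph:
        if color[wallet] == WHITE:
            cycle = visit(wallet)
            if cycle:
                return cycle
    return None
```

The recursion keeps the sketch short; a graph with very long chains would need an explicit stack to stay within Python's recursion limit.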

University Committee Member

NJIT English Language Program

September 2023 – May 2024

Personal Projects

Real-Time Payment Processing & Fraud Detection System

C++ · Java · Python · Spring Boot · PostgreSQL · Graph Analysis · REST APIs

Thread-Safe Rate Limiter

Java · Concurrency · Token Bucket Algorithm · ConcurrentHashMap · AtomicLong · Synchronized Data Structures

Implemented a per-user rate limiter using the token bucket algorithm, enforcing long-term rate limits while allowing short bursts. Supports 50+ concurrent users in simulation without violating constraints. Thread safety is achieved via synchronized access patterns and concurrent data structures, with no race conditions observed under adversarial timing.
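
The token bucket idea can be sketched in a few lines of Python. This is an illustration of the algorithm only, not the project's Java code: the class names, the injectable `clock` parameter, and the use of plain locks are all assumptions made for the example.

```python
import threading
import time


class TokenBucket:
    """Holds up to `capacity` tokens, refilled at `refill_per_sec`."""

    def __init__(self, capacity, refill_per_sec, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        self.tokens = float(capacity)  # start full so a burst is allowed
        self.last = clock()
        self.lock = threading.Lock()

    def allow(self):
        """Consume one token if available; return whether the call is allowed."""
        with self.lock:
            now = self.clock()
            refill = (now - self.last) * self.refill_per_sec
            self.tokens = min(self.capacity, self.tokens + refill)
            self.last = now
            if self.tokens >= 1:
                self.tokens -= 1
                return True
            return False


class RateLimiter:
    """Keeps one bucket per user, created lazily on first request."""

    def __init__(self, capacity, refill_per_sec, clock=time.monotonic):
        self._args = (capacity, refill_per_sec, clock)
        self._buckets = {}
        self._lock = threading.Lock()

    def allow(self, user_id):
        with self._lock:
            bucket = self._buckets.get(user_id)
            if bucket is None:
                bucket = self._buckets[user_id] = TokenBucket(*self._args)
        return bucket.allow()
```

The capacity bounds the burst size and the refill rate bounds the long-term average, which is how the two limits coexist.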

In-Memory Cache Engine with LRU & TTL

C · Java · HashMap · LinkedList · TTL · System Design · Memory Management

Built an in-memory cache supporting fast data access via LRU eviction (LinkedHashMap-backed O(1) get/put) and time-based expiration (TTL), automatically invalidating stale entries on access with no separate cleanup pass. Validated across concurrent read/write scenarios, demonstrating constant-time performance at capacity.
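
A rough Python analogue of the design, using `OrderedDict` in place of Java's `LinkedHashMap`. The class name `LruTtlCache` and the injectable `clock` are hypothetical, and the sketch is single-threaded; a concurrent version would need a lock around each method.

```python
import time
from collections import OrderedDict


class LruTtlCache:
    """LRU cache whose entries also expire `ttl_seconds` after being written."""

    def __init__(self, capacity, ttl_seconds, clock=time.monotonic):
        self.capacity = capacity
        self.ttl = ttl_seconds
        self.clock = clock
        self._data = OrderedDict()  # key -> (value, expiry time), oldest first

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._data[key]  # lazy expiration: drop stale entry on access
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._data[key] = (value, self.clock() + self.ttl)
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict least recently used
```

Every operation is a constant number of hash-table and linked-list steps, so `get` and `put` stay O(1) regardless of how full the cache is.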

Automated Flight Deal Finder

Python · Java · REST APIs · Amadeus API · Sheety API · HTTP · JSON Processing · Maven

Built a fully automated system that resolves missing IATA airport codes via API, retrieves real-time flight prices across routes, compares them against stored thresholds, and fires automated price-drop alerts. Integrated HTTP/JSON APIs to keep records up to date and notify users when fares fall below their target.

Computer Vision – Smart Video Understanding

Python · YOLO · DeepSORT · BLIP · PyTorch · OpenCV

End-to-end real-time computer vision pipeline: YOLO detects objects per frame, DeepSORT assigns persistent cross-frame track IDs, and BLIP generates natural language scene descriptions. Detects semantic events (objects entering/leaving frame, crowd density changes) and outputs an annotated video alongside a structured event log.

Adaptive ML Pipeline with Interactive Prediction Interface

Python · Scikit-learn · Dash · AWS Elastic Beanstalk · GridSearchCV

Full ML pipeline generalizing to any uploaded CSV: automated preprocessing, multi-model comparison (Logistic Regression, Decision Tree, Random Forest) with GridSearchCV tuning, and a deployed Dash web app with real-time classification and regression predictions. Deployed on AWS Elastic Beanstalk / Render via Gunicorn.

Graph Analytics Toolkit

Chapel · Python · MPI · OpenMP · HPC · Graph Metrics · Arkouda

High-performance graph metrics library for large-scale datasets using Chapel and Python. Designed for parallel execution on multi-node HPC clusters, supporting community detection, centrality, and connectivity metrics across billion-edge graphs.

ML-Driven Network Anomaly Detection

Python · PyTorch · GNNs · Transformers · NetworkX · Scikit-learn

Used Graph Neural Networks (GNNs) and transformer architectures to identify structural anomalies in financial transaction graphs — learning latent embeddings that distinguish normal transaction patterns from fraud clusters.


Relevant College Courses

Introduction to Computer Science 1 & 2

Comprehensive intro to CS using Java and Python — OOP, recursion, inheritance, generics, and core data structures (linked lists, stacks, queues, trees, graphs). Covered algorithm design and complexity analysis with an emphasis on building correct, efficient programs from scratch.

  • Built a generic BinarySearchTree<T> with recursive insert, delete (two-child case via in-order successor), and a custom Iterator<T> inner class, following the same iterator pattern that Java's standard collections expose.
  • Implemented HeapSort and benchmarked it against O(n²) sorts on large random and adversarial inputs — empirically confirming asymptotic theory with measured data.
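
For reference, heapsort fits in a short function. This is a generic textbook version in Python, not the coursework submission; `heap_sort` and `sift_down` are illustrative names.

```python
def heap_sort(items):
    """Return a new ascending list, sorted with an in-place binary max-heap."""
    a = list(items)
    n = len(a)

    def sift_down(root, end):
        """Restore the max-heap property for the subtree at root within a[:end]."""
        while True:
            child = 2 * root + 1
            if child >= end:
                return
            if child + 1 < end and a[child + 1] > a[child]:
                child += 1  # pick the larger of the two children
            if a[root] >= a[child]:
                return
            a[root], a[child] = a[child], a[root]
            root = child

    # Build the heap bottom-up, starting from the last internal node.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(i, n)

    # Repeatedly move the current maximum to the end of the unsorted prefix.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(0, end)
    return a
```

Building the heap costs O(n) and each of the n extractions costs O(log n), giving O(n log n) even on adversarial inputs where a quadratic sort degrades.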

Foundations of CS – Discrete Mathematics

Mathematical foundations of CS: propositional logic, set theory, proof techniques (induction, contradiction, contrapositive), functions, asymptotic notation, recurrence relations, and introductory graph theory — the toolbox behind every algorithm analysis.

  • Derived closed-form solutions for recurrence relations from first principles — the same substitution method used to verify runtime bounds for the parallel graph algorithms I develop in research.
  • Modelled network connectivity and cycle detection as graph reachability problems using DFS — applying the reasoning before ever formally taking a graph algorithms course.

Foundations of CS II – Theory of Computation

Automata theory, computability, and complexity — DFAs, NFAs, PDAs, Turing machines, undecidability (Halting Problem), and complexity classes P, NP, NP-Complete. A rigorous, proof-heavy course answering what computers can and cannot solve efficiently.

  • Implemented the DFA emptiness decision algorithm (BFS-based reachability from start state) that correctly classifies any encoded DFA — turning abstract decidability theory into working code.
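
The emptiness check reduces to graph reachability: a DFA's language is empty exactly when no accepting state can be reached from the start state. A minimal sketch, assuming the DFA is encoded as a `(state, symbol) -> state` dictionary; this encoding and the function name are choices made for the example, not the course's format.

```python
from collections import deque


def dfa_language_is_empty(start, accepting, delta):
    """Return True if no accepting state is reachable from `start`.

    `delta` maps (state, symbol) pairs to the next state.
    """
    successors = {}
    for (state, _symbol), target in delta.items():
        successors.setdefault(state, set()).add(target)

    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state in accepting:
            return False  # some input string drives the DFA here
        for nxt in successors.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True
```

Symbols are ignored during the search because only the existence of a path matters, not which string labels it.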

Programming Language Concepts

PL design from first principles: syntax, semantics, lexical/syntactic analysis, variable binding, scope and lifetime, control flow, OOP, and exception handling. Built a full language toolchain in C++ using manual memory management — no garbage collector.

  • Built a complete language toolchain in C++ from scratch: lexer → recursive-descent parser → AST interpreter with variable scoping, type safety, and control flow, the same front-end stages found in most production compilers and interpreters.
  • Managed all heap allocations manually with new/delete and deep-copy semantics throughout — a strict no-GC environment that sharpened low-level memory intuition directly applied in HPC C++ work.

Machine Learning

Supervised, unsupervised, and reinforcement learning — k-NN, logistic/linear regression, SVMs, decision trees, random forests, boosting (AdaBoost, XGBoost), ANNs, RNNs, transformers, VAEs, GANs, PCA, clustering, autoencoders, and deep RL. Hands-on with PyTorch and scikit-learn.

  • Designed a VAE for unsupervised generation — interpolating through the latent space to produce novel samples — then benchmarked against a GAN trained on identical data to compare output quality and training stability.
  • Shipped a full-cycle ML product: preprocessing → GridSearchCV selection → Dash web app deployed on AWS with live inference.

Data Science

Full data science pipeline from EDA to deployment — NumPy, Pandas, Plotly/Dash, regression, classification, tree methods, PyTorch-based neural networks, RNNs, transformers, LLMs, and agentic AI with LangChain/LangGraph.

  • Deployed a multi-model prediction web app (Logistic Regression, Decision Tree, Random Forest) on Render: it accepts any uploaded CSV, performs automated preprocessing, and returns live classification or regression results.
  • Built a LangChain/LangGraph multi-agent pipeline with autonomous multi-step reasoning, tool use, and memory — an AI agent that plans its own approach, fetches external data, and synthesises answers end-to-end.

Computer Systems

Computer organisation from a programmer's perspective — C programming, x86-64 assembly, CPU pipelining, memory hierarchy, caching, virtual memory, and address translation. Explored how high-level C maps all the way down to machine instructions.

  • Wrote x86-64 assembly interfacing with compiled C — understanding exactly how compilers generate code, manage stack frames, and allocate registers, which directly informs writing performance-critical C/C++ for HPC.
  • Simulated virtual-to-physical address translation end-to-end: multi-level page table walks, TLB miss modelling, and cache hierarchy performance measurement — connecting theory to real HPC memory bottlenecks.
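
A multi-level page table walk with a TLB in front can be sketched as follows. The layout assumed here is x86-64 style 4-level paging with 4 KiB pages (9 index bits per level, 12 offset bits); modelling page tables as nested dictionaries and the TLB as an unbounded dictionary is a simplification for illustration, and the function names are hypothetical.

```python
PAGE_OFFSET_BITS = 12  # 4 KiB pages
INDEX_BITS = 9         # 512 entries per table
LEVELS = 4


def split_virtual_address(vaddr):
    """Return ([index per level, top level first], page offset)."""
    offset = vaddr & ((1 << PAGE_OFFSET_BITS) - 1)
    vpn = vaddr >> PAGE_OFFSET_BITS
    indices = []
    for level in range(LEVELS):
        shift = INDEX_BITS * (LEVELS - 1 - level)
        indices.append((vpn >> shift) & ((1 << INDEX_BITS) - 1))
    return indices, offset


def translate(root, vaddr, tlb):
    """Return (physical address, tlb_hit). Raises KeyError on a page fault."""
    vpn = vaddr >> PAGE_OFFSET_BITS
    indices, offset = split_virtual_address(vaddr)
    if vpn in tlb:
        return (tlb[vpn] << PAGE_OFFSET_BITS) | offset, True

    node = root
    for index in indices:  # one memory access per level on a TLB miss
        node = node.get(index)
        if node is None:
            raise KeyError(f"page fault at {vaddr:#x}")
    frame = node  # the last level stores the physical frame number
    tlb[vpn] = frame
    return (frame << PAGE_OFFSET_BITS) | offset, False
```

The first access to a page pays for four table lookups, while later accesses to the same page are served from the TLB, which is the effect the miss modelling measures.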

Intensive Programming in Linux

Comprehensive Linux programming — Bash scripting, systems-level C with dynamic memory and pointers, multi-threading, OpenMPI for distributed computing, web scraping, and end-to-end high-performance application development.

  • Built a distributed real-time stock tracker with MPI: one process scrapes live prices while workers rank the top 100 — a practical map-reduce pipeline across multiple machines, directly translating to my HPC research workflow.
  • Implemented A* pathfinding over a dynamically-allocated 2D grid in C with full malloc/free memory management, zero memory leaks verified under Valgrind.
  • Engineered a multi-threaded web scraper in C using pthreads — concurrent fetching and parsing of financial pages protected by mutex locks across all shared data structures.

Advanced Data Structures & Algorithms

Senior-level algorithm design and analysis — sorting, balanced BSTs (AVL, Red-Black), hash tables, heaps, graph algorithms, and core paradigms: divide-and-conquer, greedy methods, and dynamic programming.

  • Implemented Dijkstra SSSP and Floyd-Warshall APSP, then modelled shortest communication paths in network topology graphs — directly applicable to routing in the distributed HPC systems I work on in research.
  • Built AVL and Red-Black trees with full rotation logic and proved the height invariant — the same O(log n) guarantee behind B-tree indexing in production databases.
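
Dijkstra's single-source shortest paths can be written compactly with a binary heap. A minimal sketch, assuming non-negative edge weights and an adjacency-list dictionary; the graph encoding and the function name are illustrative.

```python
import heapq


def dijkstra(graph, source):
    """Return {node: shortest distance from source}.

    `graph` maps each node to a list of (neighbor, weight) pairs,
    and all weights must be non-negative.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for v, w in graph.get(u, ()):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```

Skipping stale entries stands in for a decrease-key operation, which Python's `heapq` does not provide.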

Intro to Cybersecurity

Broad overview of cybersecurity — CIA triad, cryptography (AES, RSA, hash functions, PKI), network security, firewalls, IDS/IPS, malware, web vulnerabilities (SQLi, XSS, CSRF), and user authentication. Hands-on with Kali Linux and SEED Labs.

  • Full recon-to-exploit lab on Kali Linux: nmap service enumeration, then a crafted packet that bypassed a misconfigured Snort IDS rule, then a switch of sides to patch the rule, covering attack and defence in a single exercise.
  • Demonstrated a live SQL injection attack on a login form, then patched it with parameterised queries — understanding exactly why user input must never reach a raw SQL string.
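
The difference between string-built SQL and a parameterised query can be shown with Python's built-in `sqlite3` module. The table, the credentials, and the function names are made up for the example, and the plaintext password column exists only to keep the demo short; a real system would store password hashes.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a single demo user."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
    conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")
    return conn


def login_vulnerable(conn, name, password):
    """Builds the query by string interpolation: user input becomes SQL."""
    query = (
        "SELECT COUNT(*) FROM users "
        f"WHERE name = '{name}' AND password = '{password}'"
    )
    return conn.execute(query).fetchone()[0] > 0


def login_safe(conn, name, password):
    """Uses placeholders: user input is bound as data, never parsed as SQL."""
    query = "SELECT COUNT(*) FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (name, password)).fetchone()[0] > 0


conn = make_db()
payload = "' OR '1'='1"  # classic tautology injection
```

With this payload the interpolated query's `WHERE` clause ends in `OR '1'='1'`, so it matches every row, while the placeholder version simply compares the payload against the stored password and fails.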

Database Design & Management

Relational database theory and practice — ER/EER modelling, relational algebra, SQL (DDL, DML, complex queries, views), functional dependencies, 3NF normalisation, indexing, hashing, and transaction processing.

  • Built a team database-backed application end-to-end: 3NF-normalised schema, multi-table SQL queries with window functions and correlated subqueries, index-tuned query plans, and atomic transactions with rollback — production-quality database engineering.

Principles of Operating Systems

OS fundamentals — processes, threads, synchronisation (mutexes, semaphores), deadlock detection/prevention, CPU scheduling, physical and virtual memory, paging, segmentation, and I/O device management.

  • Solved Producer-Consumer and Readers-Writers with semaphores and mutex locks — the exact synchronisation primitives underlying the concurrent graph analytics engines I build in my HPC research.
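
The classic semaphore solution to Producer-Consumer uses two counting semaphores and one mutex. A minimal sketch with Python's `threading` module; `BoundedBuffer` and `run_demo` are illustrative names and the sizes are arbitrary.

```python
import threading
from collections import deque


class BoundedBuffer:
    """Fixed-capacity FIFO guarded by two semaphores and a mutex."""

    def __init__(self, capacity):
        self._items = deque()
        self._mutex = threading.Lock()
        self._empty_slots = threading.Semaphore(capacity)  # free space left
        self._filled_slots = threading.Semaphore(0)        # items available

    def put(self, item):
        self._empty_slots.acquire()   # block while the buffer is full
        with self._mutex:
            self._items.append(item)
        self._filled_slots.release()  # signal that one item is ready

    def get(self):
        self._filled_slots.acquire()  # block while the buffer is empty
        with self._mutex:
            item = self._items.popleft()
        self._empty_slots.release()   # signal that one slot is free
        return item


def run_demo(num_items=100, capacity=4):
    """Run one producer and one consumer; return items in consumed order."""
    buffer = BoundedBuffer(capacity)
    consumed = []

    def producer():
        for i in range(num_items):
            buffer.put(i)

    def consumer():
        for _ in range(num_items):
            consumed.append(buffer.get())

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return consumed
```

Acquiring the counting semaphore before the mutex matters: taking the mutex first and then blocking on a full or empty buffer would deadlock.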

Computer Networks

Internet protocol stack from link to application — ARP, IP addressing, DHCP, routing algorithms (BGP, Dijkstra), TCP/UDP, congestion control, HTTP, DNS, and NAT. Hands-on with Wireshark and socket programming.

  • Built a networked multiplayer game over raw TCP sockets with a custom application-layer protocol (handshake, move validation, GG signalling) — then extended it with a RESTful Flask matchmaking API handling skill-based pairing and Elo updates.
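
An Elo update after a single game follows the standard formula: each player's expected score comes from the rating difference, and the rating moves by K times the gap between actual and expected score. The sketch assumes K = 32, a common default; the project's actual constant and function names are not stated in the source.

```python
def expected_score(rating_a, rating_b):
    """Expected score of player A against player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update_elo(rating_a, rating_b, score_a, k=32):
    """Return the new (rating_a, rating_b) after one game.

    `score_a` is 1 for an A win, 0.5 for a draw, and 0 for a loss.
    """
    expected_a = expected_score(rating_a, rating_b)
    expected_b = 1.0 - expected_a
    new_a = rating_a + k * (score_a - expected_a)
    new_b = rating_b + k * ((1.0 - score_a) - expected_b)
    return new_a, new_b
```

Because the two expected scores sum to one, whatever one player gains the other loses, so the total rating in the pool is conserved.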

Robotics Science Fundamentals

Mobile robot kinematics, sensor data processing (ultrasonic, IMU, gyroscope), linear/nonlinear motion control, hardware-in-the-loop simulation, and motion planning — implemented in Python for real differential-drive robots.

  • Capstone: built a full autonomous perception pipeline (YOLO object detection → DeepSORT persistent tracking → BLIP natural language scene description) deployed on a real differential-drive robot.

Linear Algebra

Comprehensive linear algebra: Gauss-Jordan elimination, LU/CR/QR/SVD factorizations, determinants, eigenvalues and diagonalization, the four fundamental subspaces (column, row, null, left-null), least squares, and orthogonal diagonalization — the mathematical backbone of HPC, ML, and scientific computing.

  • Applied SVD to implement low-rank image compression — rank-k approximations that trade storage for fidelity, the same technique behind PCA, recommender systems, and latent semantic indexing.
  • Analysed graph adjacency matrix eigenstructure to derive spectral embeddings — directly linking textbook linear algebra to the spectral clustering methods used in my own research.

Probability & Statistics

Rigorous probability and statistical inference — discrete distributions (Binomial, Geometric, Poisson), continuous distributions (Uniform, Exponential, Normal), Bayes' theorem, hypothesis testing (z-tests, t-tests, Type I/II errors), confidence intervals, sample-size determination, and linear regression.

  • Implemented a statistical A/B testing pipeline — z-scores, p-values, confidence intervals, and power analysis — then used it directly to validate performance claims in my HPC research benchmarks.
  • Modelled graph event arrivals as a Poisson process, fitted parameters via MLE, and validated fit with a chi-squared test — applied to characterise query workloads in distributed graph analytics.
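
One building block of an A/B testing pipeline is the two-proportion z-test. A minimal sketch using only the standard library; the function name, the pooled-variance form, and the two-sided alternative are choices made for this example rather than details taken from the coursework.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Return (z, two-sided p-value) for H0: both groups share one rate."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value
```

The normal approximation is reasonable only when each group has a fair number of successes and failures; small samples call for an exact test instead.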

Awards & Honors

Honors College

Honors Student & Dorman Scholar · Albert Dorman Honors College, NJIT

Selective admission to NJIT's Albert Dorman Honors College, offered to first-year students who demonstrate exceptional academic performance. The Honors College provides access to advanced research opportunities, distinguished faculty mentorship, and a community of high-achieving students.

2024
Academic Excellence

Dean's List — Fall 2023 through Spring 2026

Earned Dean's List distinction in all six semesters since enrolling at NJIT, maintaining a 4.0 GPA while simultaneously conducting full-time HPC research and competing as an NCAA Division I athlete.

Fall 2023 · Spring 2024 · Fall 2024 · Spring 2025 · Fall 2025 · Spring 2026
2023 – 2026
National Scholarship

Prime Minister's Scholarship — Poland

Competitive national scholarship awarded by the Prime Minister of Poland to high school graduates achieving the highest GPA and outstanding performance on national university entrance examinations — recognizing top academic achievers across the country.

2022
NCAA Division I Athletics
CSC Academic All-District

CSC Academic All-District Team · NJIT Men's Swimming & Diving · Two-Time Honoree

Selected to the College Sports Communicators (CSC) Academic All-District® Team in both 2025 and 2026, recognizing student-athletes who achieve at the highest level academically while competing at the NCAA Division I level. One of only four NJIT swimmers honored in each selection cycle.

2025 Honoree · 2026 Honoree
2025 – 2026

As an NCAA Division I swimmer at NJIT, I train and compete at the highest collegiate level while maintaining a 4.0 GPA and conducting research. The same attributes that make a competitive swimmer (precision under pressure, structured discipline, and consistent high performance) are what I bring to every research problem and system I build.


Contact Me

Feel free to reach out to me via email at bartosz.bryg@gmail.com.