
Devansh Chandak

Computer Science Undergrad

Indian Institute of Technology, Bombay

Hi there!

I am Devansh Chandak, a third-year undergraduate in the Department of Computer Science and Engineering at the Indian Institute of Technology, Bombay. I am fascinated by all areas of Computer Science and what it can do.
I have experience in deep learning, quantitative analytics, software development, algorithmic trading, cryptography and verification.

Recently, I co-authored a publication at the COLING 2020 conference on Multi-Hop Inference for Explanation Regeneration, a shared task at TextGraphs-14.

Currently, I am a Software Engineering Intern at Microsoft in the Azure Compute Group at IDC, working on OneFleet Autopilot.

Please browse the website for a look at my internship and research experience, projects and skills. For better insight into my life and achievements so far, you can have a look at my Curriculum Vitae/Resume.

Interests

  • Deep Learning & Natural Language Processing
  • Software Development
  • Cloud Computing
  • Database and Information Systems
  • Artificial Intelligence
  • Quantitative Finance & Algorithmic Trading
  • Cryptography & Verification

Education

  • BTech in Computer Science, 2018 - Present

    Indian Institute of Technology, Bombay

Experience

For a detailed and exhaustive list, please refer to my CV/Resume


Software Engineering Intern

Microsoft

May 2021 – Jul 2021 · Hyderabad, India
  • Worked in the OneFleet Autopilot team in the Azure Compute Group (part of Cloud + Artificial Intelligence division) at IDC
  • As part of an effort to consume datacenter inventory in a unified format, updated a critical tool, used by Autopilot teams, tenants and services to enumerate fault domains for servers in different datacenters, to support both inventory formats in use today while we migrate from one to the other
  • In addition, extended the same tool to run both on a developer machine and in the datacenter
  • Used C#, C++/CLI and C++ and Cockpit (for querying)

Software Development Intern

Motilal Oswal Financial Services

Dec 2020 – Jan 2021 · Mumbai, India
  • Designed an HR Compliance portal with different functions for different user types, using C#, ASP.NET and MS-SQL
  • Features: add/edit user and document category details, upload documents in each category and view them in a repository

Research Intern

INRIA

Apr 2020 – May 2020 · Nancy, France

Cryptography: Formal Verification of Security Protocols

  • Studied operational semantics and equivalence properties (in the applied pi calculus and the Tamarin prover), and the SAPIC plugin (tool translating high level protocols to multiset rewrite rules, analyzable by Tamarin)
  • Introduced the notion of biprocesses (semantics and translation) and diff equivalence in SAPIC, and worked on the soundness proof of the translation after the addition

Quantitative Research Analyst

Indian School of Business

Dec 2019 – Jan 2020 · Hyderabad, India

Deep Learning: Applying NLP Techniques to Time Series Analysis for Stock Futures

  • Designed and implemented an intuitive approach to storing the history of a stock in the form of a vector using a Ticker Embedding Model, similar to that in a Word Embedding model
  • Incorporated a number of technical indicators such as Momentum, Trailing Volatility, Asset Class and average return across each asset class along with these embeddings for time series analysis
  • Designed, trained and tested an LSTM classifier (built using PyTorch) on a time series of multiple stock tickers to predict the Expected Return and to study non-linearity and inter-asset-class correlation
  • Expanded the base LSTM to incorporate attention, and retrain over the latest data while testing
  • Optimized the hyperparameters using libraries: Ray for Grid Search and Hyperopt for Bayesian optimization
  • Awarded a Letter of Recommendation for exceptional performance shown throughout the internship

Trading Algorithms: Implementation in Python

  • Worked towards developing, modifying and implementing PAIRS, Betting against Beta and Momentum trading algorithms on the Indian Stock market at the NSE Trading Lab

  • Calculated Beta by regression on the CAPM equation with a rolling 6-month window:

    • The strategy was implemented with daily, weekly and monthly rebalancing of the portfolio

    • Performed and analyzed the difference in output on equal-weighted and value-weighted portfolios

  • Modified the PAIRS strategy on a rolling window of 1 year, achieving 12% CAGR and an overall Sharpe ratio of 0.71

  • Researched the intricacies involved in the trading strategies: Piotroski’s F-Score, Mohanram’s G-Score, Accruals, PEAD and Momentum crashes


Data Analyst

Spencer Retail

Jun 2019 – Jul 2019 · Kolkata, India
  • Performed statistical analysis of transactional & brick-level data of underperforming stores to understand and attribute reasons for de-growth, using Pandas, SQLite and various graph visualizations in Matplotlib
  • Given the KPIs for each category, performed a deep dive into individual SKU-level performance to propose solutions to counter de-growth at the MGF Gurgaon Hyper store and the Vizag Hyper store

Machine Learning Intern

Indian Institute of Technology, Kanpur

May 2019 – Jun 2019 · Kanpur, India

Analysis of ML Algorithms for Spam Email Classification in Python

  • Analyzed KNN, Naive Bayes, SVMs and Neural Networks, and finally implemented Naive Bayes and KNN for the classification of various data sets using Keras, Pandas, Numpy and Scikit-learn
  • Compared accuracies across data sets and identified the best method for each
  • Awarded a Letter of Recommendation for exceptional performance shown during the internship

Software Engineering Intern

Citytech Software

Nov 2018 – Dec 2018 · Kolkata, India
  • Configured and enhanced a chatbot for Paylite Leave Application, using Microsoft LUIS platform
  • Helped introduce a voice-to-text feature (using the Bing API) from Microsoft Azure
  • Researched Human Resource Automation and performed a comparative study of LUIS, Google Dialog Flow and other developments in Google Assistant, IBM Watson, Alexa and Cortana

Projects and Key Assignments

For a detailed description, have a look at my CV/Resume


Restaurant Management System

-> Created a GUI website application for a Restaurant Management System with cookie based login authentication
-> The application creates an ordering pipeline (placing an order, allotting it to a chef, then serving by a waiter) that simulates a real-world restaurant system
-> A customer can view their order history and recommended dishes, filter dishes by cuisine and budget, and place an order; a chef/waiter can view their profile and complete allotted orders
-> The owner can view and update inventory and employee information, allot orders to chefs/waiters, and view analytics and graphs on top dishes, top employees and statistics on profits, expenditure and wastage (analytics filterable by date range)
-> Used MVC architecture in NodeJS (Express), PostgreSQL, Bootstrap, ChartJS, html2pdf.js

Sclp compiler

-> Created a C-like compiler from scratch using lex and yacc
-> Implemented the scanning, parsing, Abstract Syntax Tree (AST), Three Address Code (TAC) and Register Transfer Language (RTL) stages for input programs with visibility of output of each intermediate stage
-> Input programs had assignments, functions, complex expressions and control flow structures
-> Ensured illegal tokens, syntax errors, semantic errors are reported

E-Shop

-> Built an E-Shopping Website where admins can add products, and users can add items to cart, order and view their order history
-> Insufficient user credit leads to failure of the cart order execution
-> Used MVC architecture in NodeJS (Express), PostgreSQL and Bootstrap

Web Crawler

-> Built a web crawler using Spark which starts with a URL, each URL having a new dataset of URLs to crawl
-> Used RDDs and transformations and outputted tuples of the form (url, indegree)
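The RDD pipeline itself needs a Spark cluster, but the core bookkeeping can be sketched in plain Python. The link graph below is a made-up stand-in for fetched pages (the actual project discovered out-links by crawling), and the function name is illustrative:

```python
from collections import deque

# Hypothetical in-memory link graph standing in for fetched pages;
# the real project extracted out-links from live URLs via Spark RDDs.
LINKS = {
    "a.com": ["b.com", "c.com"],
    "b.com": ["c.com"],
    "c.com": ["a.com"],
}

def crawl_indegrees(start):
    """BFS from `start`, returning sorted (url, indegree) tuples."""
    indegree = {start: 0}
    seen, queue = {start}, deque([start])
    while queue:
        url = queue.popleft()
        for out in LINKS.get(url, []):
            indegree[out] = indegree.get(out, 0) + 1  # count each incoming edge
            if out not in seen:
                seen.add(out)
                queue.append(out)
    return sorted(indegree.items())
```

In the Spark version the same counting step becomes a `flatMap` over out-links followed by a `reduceByKey` sum.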

Pure Numpy Implementation of CNN

-> Implemented a CNN model with its Fully Connected, Convolution, Average and Max Pooling layers in pure Numpy
-> Trained the model on the MNIST and CIFAR10 datasets to achieve accuracies of 94% and 53% respectively
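The heart of such a model is the convolution forward pass. A minimal NumPy sketch in the spirit of this project, assuming a single channel, stride 1 and valid padding (the function name is illustrative):

```python
import numpy as np

def conv2d_forward(x, w):
    """Valid-mode 2D convolution (cross-correlation), stride 1.
    x: (H, W) input image, w: (kH, kW) filter."""
    H, W = x.shape
    kH, kW = w.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Each output cell is the elementwise product of the
            # filter with one input window, summed.
            out[i, j] = np.sum(x[i:i + kH, j:j + kW] * w)
    return out
```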

Buffer Overflow Attacks and Defenses

-> Demonstrated the Stack based and Heap based buffer overflow exploits along with their special cases: Return to LibC, Off by One and Use after Free using C and x86.
-> Performed a detailed case study on the Code Red Worm which was based on buffer overflow.

File System Implementation

-> Emulated a disk over a text file with the superblock, inode and data blocks.
-> Implemented a file system on the emulated disk with basic operations like open/close/read and write.

Copy-on-Write Fork in xv6

-> Implemented the Copy-on-Write Fork in the xv6 OS.
-> It allocates new pages for the memory image only on modification by the child/parent.

Custom Memory Manager

-> Implemented a memory manager to allocate and deallocate memory dynamically.
-> Extended the allocator to be elastic and map pages only on demand.

Scheduler in xv6

-> Modified the current scheduler in xv6 to consider user-defined process priorities.
-> Used priorities as weights to implement a weighted round-robin scheduler, while taking care of starvation
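The policy (not the actual xv6 C code) can be sketched in Python: each process gets consecutive ticks proportional to its weight, and because every process with a positive weight runs in every round, none starve. Names here are illustrative:

```python
def weighted_round_robin(procs, slices):
    """Produce `slices` scheduling decisions for (name, weight) pairs.
    Each process with weight w gets w consecutive ticks per round."""
    schedule = []
    while len(schedule) < slices:
        for name, weight in procs:      # one full round over all processes
            for _ in range(weight):     # weight ticks for this process
                schedule.append(name)
                if len(schedule) == slices:
                    return schedule
    return schedule
```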

Custom Linux Shell

-> Built own Linux-like shell in C.
-> Added support for background, serial & parallel processes, and kill signal & exit command

Sentiment Analysis by BERT

-> Achieved 91% accuracy in predicting positive/negative sentiments on the IMDB reviews dataset
-> Used BERT from the Hugging Face transformers library and PyTorch for preprocessing and fine-tuning the model

Distributed Spanning Tree Protocol

-> Simulated the network bridge topology as a distributed system of nodes, communicating via messages, in C++
-> Configured nodes to run the protocol and agree upon a loop-less logical topology to prevent a broadcast storm

SAT Solver

-> Designed a SAT Solver using z3 in Python, to check satisfiability in CNF (Conjunctive Normal Form)
-> Solved the N-Queens and Sudoku problems with the designed solver, using DPLL (a backtracking algorithm)
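The project itself used z3; as a self-contained illustration of the DPLL-style backtracking search mentioned above (not the z3 API), here is a tiny solver over DIMACS-style integer literals:

```python
def dpll(clauses, assignment=None):
    """Backtracking DPLL-style search. clauses: lists of non-zero ints
    (positive = variable true, negative = false). Returns a satisfying
    assignment dict {var: bool} or None if unsatisfiable."""
    assignment = dict(assignment or {})
    simplified = []
    for clause in clauses:
        if any(assignment.get(abs(l)) == (l > 0) for l in clause):
            continue  # clause already satisfied under the assignment
        rest = [l for l in clause if abs(l) not in assignment]
        if not rest:
            return None  # clause falsified: backtrack
        simplified.append(rest)
    if not simplified:
        return assignment  # every clause satisfied
    var = abs(simplified[0][0])  # branch on the first unassigned variable
    for value in (True, False):
        result = dpll(simplified, {**assignment, var: value})
        if result is not None:
            return result
    return None
```

Unit propagation folds into the branching here; a production solver would propagate unit clauses explicitly.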

Applying NLP techniques to Time Series Analysis for Stock Futures

-> Designed and implemented an intuitive approach to storing the history of a stock in the form of a vector using a Ticker Embedding Model, similar to that in a Word Embedding model, representing the meaning and context of the word. The ticker embedding hence represents the past movement of a stock and clusters similar stocks. The corpus was constructed using features: Price, Volume, Open Interest, Asset Class and Exchange
-> Incorporated a number of technical indicators such as Momentum, Trailing Volatility, Asset Class and average return across each asset class along with these embeddings for time series analysis
-> Designed, trained and tested an LSTM classifier (built using PyTorch) on a time series of multiple stock tickers to predict the Expected Return and to study non-linearity and inter-asset-class correlation
-> Expanded the base LSTM to incorporate attention, and retrain over the latest data while testing
-> Optimized the hyperparameters using libraries: Ray for Grid Search and Hyperopt for Bayesian optimization
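Two of the indicators mentioned above can be sketched in plain Python, with illustrative definitions that may differ from the ones the project used: momentum as the fractional price change over n steps, and trailing volatility as the standard deviation of the last n simple returns.

```python
from statistics import pstdev

def momentum(prices, n):
    """n-period momentum: fractional change over the last n steps."""
    return prices[-1] / prices[-1 - n] - 1

def trailing_volatility(prices, n):
    """Population std. dev. of the last n simple returns."""
    returns = [prices[i] / prices[i - 1] - 1
               for i in range(len(prices) - n, len(prices))]
    return pstdev(returns)
```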

Betting Against Beta

-> Implemented the Betting against Beta trading algorithm based on the research paper by Andrea Frazzini and Lasse H. Pedersen, with Beta calculated by regression on the CAPM equation
-> The dataset comprises NIFTY 50 stocks from 2002 to 2019; implemented the strategy with daily, weekly and monthly rebalancing of the portfolio over a 6-month rolling window
-> Performed and analyzed the difference in output on equal weighted and value weighted portfolios
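The rolling CAPM beta reduces to the OLS slope of stock returns on market returns over each trailing window. A NumPy sketch with synthetic returns (the project used NIFTY 50 data and a 6-month window; the window here is a placeholder):

```python
import numpy as np

def rolling_beta(stock, market, window):
    """CAPM beta = cov(stock, market) / var(market), i.e. the OLS
    slope, computed over each trailing `window` of returns."""
    betas = []
    for t in range(window, len(stock) + 1):
        s = np.asarray(stock[t - window:t])
        m = np.asarray(market[t - window:t])
        cov = np.cov(m, s)  # 2x2 sample covariance matrix
        betas.append(cov[0, 1] / cov[0, 0])
    return betas
```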

PAIRS

-> Worked towards developing, modifying and implementing the PAIRS trading algorithm on the Indian Stock market at the NSE Trading Lab
-> Modified the PAIRS strategy on a rolling window of 1 year, achieving 12% CAGR and an overall Sharpe ratio of 0.71

Google Form and Survey Management

-> Designed my own Form and Survey Management system, similar to Google Forms, with its own user authentication
-> Allowed for modular design of questions (single and multi line, file upload, drop down, checkbox, radio button, rating scale and toggle) and form validation (constraints on each answer such as alphanumeric, numeric, range, email ID, .pdf only), and added a feature for adding collaborators to a form
-> Developed shareable forms, usable as surveys and quizzes. Acquired data is analyzed by plotting numeric responses (using Matplotlib), learning dependencies among responses and presenting summaries of subjective answers
-> Used Django for backend, Sqlite3 for the database structure, Bootstrap for responsiveness

PCA on MNIST data

-> Given the MNIST dataset, Principal Component Analysis was performed on the images of each digit to visualize their principal modes of variation about the mean (by fitting a Multivariate Gaussian) in MATLAB
-> The number of significant principal eigenvalues was found, to decide on the number of degrees of freedom of each digit
-> Attributed reasons for why the number of significant eigenvalues is far smaller than the total, and identified behavioural patterns in writing digits based on the principal modes of variation
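A NumPy analogue of the MATLAB workflow, assuming rows of X are data points: center the data, form the sample covariance, and keep the eigenvectors of the k largest eigenvalues (the function name is illustrative):

```python
import numpy as np

def principal_modes(X, k):
    """Top-k principal directions of data X (n samples x d features).
    Returns (eigenvalues desc, eigenvectors as columns, mean)."""
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)   # d x d sample covariance
    vals, vecs = np.linalg.eigh(cov)       # eigh returns ascending order
    order = np.argsort(vals)[::-1][:k]     # take the k largest
    return vals[order], vecs[:, order], mean
```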

Fruit Image Generation & PCA

-> Principal Component Analysis was performed on RGB images of 100 fruits, and the closest representations were plotted, in MATLAB, using the mean and the four eigenvectors corresponding to the four most significant eigenvalues of the covariance matrix. A MultiVariate Gaussian was fitted on the entire dataset
-> New Fruit images were generated by random sampling, using the closest representations, which were distinct from any fruit in the dataset, but representative of the dataset

Efficient Memory Allocator

-> Designed a simulator in C++ for the efficient dynamic allocation of memory to a large number of processes
-> Utilized the first-fit strategy to decide the locations at which memory should be allocated
-> Handled up to 10^6 simultaneous allocation, deallocation and termination requests
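The first-fit strategy can be sketched over a simple free-list model; the actual simulator was in C++, so this Python version is illustrative only:

```python
def first_fit(free_list, size):
    """Allocate `size` units from the first free block that fits.
    free_list: list of (start, length) blocks in address order.
    Returns (start, new_free_list), or (None, free_list) if no fit."""
    for i, (start, length) in enumerate(free_list):
        if length >= size:
            updated = list(free_list)
            if length == size:
                updated.pop(i)  # exact fit: block fully consumed
            else:
                # Shrink the block from the front.
                updated[i] = (start + size, length - size)
            return start, updated
    return None, free_list
```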

Image Reconstruction and Compression

-> Transformed distorted images by removing noise such as salt-and-pepper noise using Numpy & Scipy
-> Used the KMeans++ algorithm to flatten out coloured images across several K values to obtain the enhanced image

Non Parametric Estimation & Cross Validation

-> Compared various non-parametric estimation techniques like histogramming and Kernel Density Estimation, and analyzed their rates of convergence and optimal parameter values
-> Implemented the Cross-Validation procedure in MATLAB by finding the bandwidth parameter that gives the maximum joint likelihood and a minimum deviation between the empirical and the actual PDF
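A likelihood-based cross-validation sketch in the same spirit, using a leave-one-out Gaussian KDE in NumPy rather than MATLAB; function names are illustrative:

```python
import numpy as np

def loo_log_likelihood(data, h):
    """Leave-one-out log-likelihood of a 1-D Gaussian KDE with
    bandwidth h: each point is scored under the KDE of the others."""
    total = 0.0
    for i in range(len(data)):
        others = np.delete(data, i)
        kernels = (np.exp(-((data[i] - others) / h) ** 2 / 2)
                   / (h * np.sqrt(2 * np.pi)))
        total += np.log(kernels.mean())
    return total

def best_bandwidth(data, candidates):
    """Pick the candidate bandwidth maximizing the LOO likelihood."""
    return max(candidates, key=lambda h: loo_log_likelihood(data, h))
```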

Spam Email Classification

This was my project during my internship at IIT Kanpur under Prof. Vipul Arora (described in the Experience section)

Screen Printing

-> Investigated rheological properties of carbon paste and analyzed cyclic voltammetry for a three-electrode system
-> Built a three-electrode system cheaper than any other in the market, using carbon and silver paste

Featured Publications


ChiSquareX at TextGraphs 2020 Shared Task: Leveraging Pretrained Language Models for Explanation Regeneration

In this work, we describe the system developed by a group of undergraduates from the Indian Institutes of Technology for the Shared Task at TextGraphs-14 on Multi-Hop Inference Explanation Regeneration (Jansen and Ustalov, 2020). The shared task required participants to develop methods to reconstruct gold explanations for elementary science questions from the WorldTree corpus (Xie et al., 2020). Although our research was not funded by any organization and all the models were trained on freely available tools like Google Colab, which restricted our computational capabilities, we have managed to achieve noteworthy results, placing ourselves in 4th place with a MAP score of 0.49021 in the evaluation leaderboard and a MAP score of 0.5062 on the post-evaluation-phase leaderboard using RoBERTa. We incorporated some of the methods proposed in the previous edition, TextGraphs-13 (Chia et al., 2019), which proved to be very effective, improved upon them, and built a model on top of it using powerful state-of-the-art pre-trained language models like RoBERTa (Liu et al., 2019), BART (Lewis et al., 2020), SciBERT (Beltagy et al., 2019) among others. Further optimization of our work can be done with the availability of better computational resources.

Skills

For an exhaustive list, do have a look at my CV/ Resume

Python

Java

Databases

Analytics

Algorithms

HTML

CSS

JavaScript

Contact

  • Hostel 5, IIT Bombay, Powai, Mumbai 400076
  • DM Me