DeltaX - HackDay 2023
This was the 4th edition of the annual Hackathon at DeltaX. This year we got the highest number of teams so far.
In this blog post, I will explore the HttpClientFactory feature introduced in .NET Core 2.1, look at the shortcomings of the HttpClient class, and discuss how HttpClientFactory can help us improve the performance and reliability of our HTTP-based communication with external services. Implementation of Polly is outside the scope of this post.
In advertising it's hard to analyse how much return we are getting from the marketing channels on which the advertising budget is spent. There's a concept called Marketing Mix Modelling (MMM) which tries to solve this particular problem. This blog takes you through an R package, Robyn, which helps us easily implement MMM on our data by automating the tedious statistical and mathematical details of building the model.
System.CommandLine is a library provided by Microsoft that covers the functionality needed by a CLI. It provides a way to parse command-line input and display help text, and it also ensures that inputs are parsed in accordance with POSIX or Windows conventions.
In this article, we’ll take a look at how to publish a dotnet package to a DeltaX Artifacts feed using NuGet.
How we stumbled upon Git submodules for one of our features and the implementation behind it. Learn more about using Git submodules for your projects.
3rd edition of the annual hackathon at DeltaX. This year we broke the stereotype that only engineers can be a part of hack days - Product, Growth and QA teams joined our Engineering team to create magic over a 24hr hacknight!
Second edition of the annual hackathon at DeltaX. But this time, a twist - hack from home. Check out this article to see how WFH did not stop DeltaX from brainstorming and creating amazing hacks.
A Practical Guide to creating A/B and Multivariate Tests
I came across AutoML while exploring some data science use cases we were tackling, and ended up digging into it for some time. Here are some notes on Google Cloud AutoML from a developer's perspective. At the tail end, I shall also share some thoughts on how this ecosystem is shaping up.
Centralized alerting for the service and infra stacks using ElastAlert.
Under the cosmos of all the development that we do lies a hidden feature, do you want to know more about it? Check this post out!
24 Hours, 12 Teams, 4 Winners at our first internal hackathon. Like a typical hackathon, the idea was to collaboratively code in an extreme manner where participants start from scratch and end with a working prototype. We were open to all kinds of ideas - building something innovative, creative or simply scratching your own itch. There were no holds barred.
A Proxy Server is a server which acts as an intermediary for requests from a client seeking resources from other servers. Learn and understand how DeltaX uses IIS ARR as a proxy server.
One fine morning, Amrith pinged me that there was an issue with login on the iOS DeltaX mobile app. I was curious to know why this was happening and how I should go about fixing it, since this was also the first time I was dealing with the login module on DeltaX. The real cause of the issue was cookies being blocked by Safari, which meant session cookies were not being sent on subsequent requests. We decided to migrate our iOS login to use the DeltaX API with JWT to avoid the cookies issue. There was no ready-to-use .NET MVC module for JWT, so we decided to write a custom validator for the JWT token generated by the DeltaX API.
Let me take you through some of the basics of JWT.
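As a quick illustrative sketch (the variable names below are hypothetical and not taken from the actual login flow): a JWT is three base64url-encoded sections separated by dots - header.payload.signature - so the claims can be read by decoding the middle section, while verifying the signature still requires the signing key.

// Hypothetical Node.js sketch: read the claims out of a JWT.
// This only decodes the payload; it does NOT verify the signature.
function decodeJwtPayload(token) {
  const payloadPart = token.split('.')[1];
  const json = Buffer.from(
    payloadPart.replace(/-/g, '+').replace(/_/g, '/'), // base64url -> base64
    'base64'
  ).toString('utf8');
  return JSON.parse(json);
}

// Usage (the token variable is a placeholder):
// const claims = decodeJwtPayload(jwtFromDeltaXApi);
// console.log(claims.sub, claims.exp);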
I've been yearning to explore GraphQL for some time now, and a daytime sleeper bus journey turned out to be the best excuse. I shall take a quick dive into its feature highlights and also share some samples on how it can solve some real challenges for us.
In school and college examinations, final marks are obtained by summing up the marks obtained for individual questions. However, standardized tests like the GMAT and GRE have adaptive systems which make it easier to obtain an average score but harder to obtain a perfect score.
It was a warm morning on the 14th of January; I remember because it was the first time I got a glimpse of what corporate life looks like. I was excited when Akshay gave us an overview of what we would be learning, starting with Vue and moving on to Ionic 1 in the second month of my internship.
In the spring of 2016 at Microsoft Build, Microsoft announced Bash on Ubuntu on Windows, which enables native Linux binaries to run on Windows via the Windows Subsystem for Linux (WSL). With WSL, anyone can run Linux distros (Linux distributions) from the Microsoft Store in minutes. It was quickly adopted by millions of Windows 10 users and has improved with every Windows semi-annual release.
We all know how important tests are for any application: they ensure that we build a bug-free application and help us avoid introducing bugs into existing functionality as we make changes. In this blog, I’ll briefly go through how we can design a .NET Core API using behaviour-driven development with SpecFlow.
Users are exposed to different channels on their way to conversion, and advertisers keep track of these journeys and want to know the effectiveness of each channel. In this blog we will take a look at how the Shapley Value method can be used for attribution in digital advertising.
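To make the idea concrete, here is a small hedged sketch - the channel names and coalition conversion rates are made up for illustration, and this is not the post's actual implementation. Each channel's credit is its average marginal contribution across all orderings of the channels.

// Hypothetical Shapley-value attribution for three channels.
const channels = ['Search', 'Social', 'Display'];

// v(S): conversion rate observed when exactly the channels in S touched the
// journey. These numbers are invented purely for the example.
const v = (subset) => {
  const key = [...subset].sort().join('+');
  const values = {
    '': 0,
    'Search': 0.05, 'Social': 0.02, 'Display': 0.01,
    'Search+Social': 0.08, 'Display+Search': 0.07, 'Display+Social': 0.04,
    'Display+Search+Social': 0.10,
  };
  return values[key] ?? 0;
};

// All orderings of the channel list.
function permutations(arr) {
  if (arr.length <= 1) return [arr];
  return arr.flatMap((x, i) =>
    permutations([...arr.slice(0, i), ...arr.slice(i + 1)]).map((p) => [x, ...p])
  );
}

// Average each channel's marginal contribution over every ordering.
function shapley(channels, v) {
  const credit = Object.fromEntries(channels.map((c) => [c, 0]));
  const orders = permutations(channels);
  for (const order of orders) {
    const seen = [];
    for (const c of order) {
      const before = v(seen);
      seen.push(c);
      credit[c] += (v(seen) - before) / orders.length;
    }
  }
  return credit;
}

console.log(shapley(channels, v)); // credits sum to v of the full set (0.10)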
One of the quirkiest things in JavaScript is actually simple as hell. This article tries to elucidate this and clear out the elusive patches I encountered during a recent face-off with JavaScript.
When Domain Driven Design (DDD) and Behaviour Driven Development (BDD) were chosen for the implementation of a Wallet Service, the two concepts were completely new to me. I spent the next week going through articles and blogs (basically spending a lot of time on Google) figuring out what exactly these two are, and it was a bit challenging to understand either of them, let alone combine them for a better development process. As the week progressed, I focused on understanding what each of them is individually first. You can find an overview of BDD here. Here, we are going to look at what DDD is on its own and how it blends with BDD.
An introduction to what Behaviour Driven Development is and how crucial it is in software development. In this blog post, I’ll briefly compare TDD to BDD, go through a few good practices in writing BDD scenarios, and explain how we use BDD at DeltaX.
Over the past year, I’ve had the chance to work on Ionic extensively. In this post, I would like to share my experience working on Ionic and discuss some of the major differences between Ionic v1 and Ionic 3.
I’ve recently been working on a new project, which involved creating a new web app that included a conversational interface. We decided to build the interface using Microsoft’s bot framework SDK. In this blog I’ll be discussing the basic concepts involved in this framework and how we can integrate it into any web application.
An Introduction to Amazon Comprehend with details of Entity Recognition using Comprehend in .NET projects
We are big fans of the Amazon ES service. Recently, AWS enabled one-click in-place upgrades with Amazon ES on your existing domains.
For the last 4 weeks, I was working on building a REST API for LeadX. LeadX is a mobile-first CRM, and considering it is a product we are building from scratch, we were able to use the current set of best practices and also apply our previous learnings from the DeltaX platform. We used .NET Core for building the API, and as part of this post I plan to discuss how we implemented API versioning - along with what it is and why it matters.
"Radar" is our monthly digest which features links that our engineering team found interesting
"Radar" is our monthly digest which features links that our engineering team found interesting
Optimizing Entity Framework memory usage to bring down service memory usage from 90% (> 8GB) to negligible. Most of the core services needed meaty VMs with more than 8GB of RAM, and these services still under-performed.
The story started when I saw this ping on the Hyper-V channel.
[03/07/2018 1:11 PM] Ketan Jawahire: When service takes more than 90% of VMs memory then it becomes really difficult to work with that VMs.
PS takes minutes even to single simple statement. 😫
[03/07/2018 1:29 PM] Suneel P: changed all Ad Data Download VMs memory config to Azure config
[03/07/2018 1:29 PM] Ketan Jawahire: I am going to install it on some azure VM & check for log files. Its really difficult to use PS with VMs using 90%+ memory
[03/07/2018 1:30 PM] Suneel P: try using Ad data download vms - 23,28,32,38. they have enough ram now
Here’s our story of a clever hack to reduce the memory footprint of EF’s cache objects (mapper objects, etc.).
"Radar" is our monthly digest which features links that our engineering team found interesting
Headless Chrome started shipping with Google Chrome from version 59 onwards. It brings all the modern web platform features provided by Chromium and the Blink rendering engine to the command line, and hence opens up quite a few possibilities. This post discusses how we use it as part of our service monitoring bot - Heimdall.
"Radar" is our monthly digest which features links that our engineering team found interesting
The story of how I hacked together a simple hiring portal.
Moving to Git hasn't been easy. We, at DeltaX, moved from SVN source control over 6 months ago. I'm writing this to give an overview of, and an early reaction to, how it has been working for us.
At DeltaX we have been using Amazon Athena as part of our data pipeline for running ad-hoc queries and analytic workloads on logs collected through our tracking and ad-serving system. Amazon Athena responds anywhere from a few seconds to minutes for data that runs into hundreds of GBs, and has pleasantly surprised us with its ease of use. As part of this blog post, I shall discuss how we went about setting up Athena to query our JSON data.
AWS Elasticsearch has its woes, which are widely publicized; this blog post discusses the reasons why we use it and why you should probably consider it too.
Tracking pixels, also referred to as 1x1 pixels, are a common way to track user activity in the analytics and ad-serving world. Overall, tracking pixels are flaky and constrained by the limitations imposed by various browser environments and network connectivity. The Beacon API proposes to address these concerns and to provide a streamlined API with predictable support across browsers.
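As a small hedged sketch (the endpoint URL and payload are placeholders, not our actual tracking endpoints), this is the kind of call the Beacon API enables: the data is handed to the browser to deliver in the background, even as the page is being torn down, instead of racing a 1x1 pixel request against navigation.

// Queue tracking data with the Beacon API when the page is being hidden.
window.addEventListener('pagehide', () => {
  const payload = JSON.stringify({ event: 'page_hide', ts: Date.now() });
  if (navigator.sendBeacon) {
    navigator.sendBeacon('https://tracking.example.com/collect', payload);
  } else {
    // Fallback: classic 1x1 pixel, which may be dropped during unload.
    new Image().src = 'https://tracking.example.com/pixel.gif?event=page_hide';
  }
});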
The Big Data ecosystem has grown leaps and bounds in the last 5 years. It would be fair to say that in the last two years the noise and hype around it have matured as well. At DeltaX, we have been keenly following and experimenting with some of these technologies. Here is a blog post on how we built our real-time stream processing pipeline and all its moving parts.
A very detailed introduction to Elasticsearch, covering all its important aspects. It includes details about clusters, nodes, shards, indexes, inverted indexes and segments.
Redesigns are not easy. With over 5 years of emotions attached to our old identity, we knew this was never going to be an easy task. I must admit we tried undertaking this endeavor a year back and failed to see it through. We had our fair share of learnings from that experience. What we were certain of is that a lot has changed in the last few years - within the ecosystem, for us as a company, and for the partners that we work with. Our identity needs to reflect this and at the same time inspire us to lead the change.
[29/09/17, 1:32:45 AM] Amrith Yerramilli: (*) (*) (*) !!!
https://production1.adbox.pro/App/AdServer/Ads/New/1501
Holy f***! what a difference
[29/09/17, 1:32:53 AM] Amrith Yerramilli: HTTP2 !!
[29/09/17, 1:33:01 AM] Amrith Yerramilli: I’m literally screaming
My conversation with Akshay a couple of weeks ago.
Looking into providing solutions to bigger problems by tackling smaller instances of the same problem.
Using CDNs (Content Delivery Networks) for static content has been a long-known best practice and something we have been using across our platform and ad-server. I wanted to share a special use case where we use a CDN (AWS CloudFront) for serving dynamic requests on our ad-server to achieve sub-second response times.
A simple introduction to VueJs
At DeltaX, we have multiple use cases for Redis.
This article gives a little background on how we latched onto Redis, some gotchas, and some free advice :)
As web developers, we often have to work with different JavaScript frameworks on a regular basis. In this article I will briefly explain my experience with the KnockoutJS library and then talk over the specifics of some of the components of KO: Observables, Dependent Observables and Templates.
Can we create a model that could predict whether an ad would get clicked or not?
Today happens to be the last day of my internship - and so it is a good time to pause and ponder over the weeks that flew by. As part of this blog post, I plan to share what I worked on, my learnings and my overall experience.
The idea is to pick a project, get something done and have a presentable result within one month. The project could be anything: a software project, a hardware project, a book review, an article... as long as there is an outcome and there is some learning involved.
There are times when we have multiple iframes on a page and all of them need to access or modify some data on the parent page, and the change needs to happen as soon as the page loads, without any delay. For an ad-serving platform, being able to achieve this is one of the primary requirements.
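A minimal hedged sketch of one common way to do this with window.postMessage - the origins, message shapes and the renderAd helper below are hypothetical, not our actual ad-serving code.

// Inside an iframe: ask the parent for shared data as soon as we load.
window.parent.postMessage({ type: 'GET_SHARED_DATA' }, 'https://parent.example.com');
window.addEventListener('message', (event) => {
  if (event.origin !== 'https://parent.example.com') return;
  if (event.data && event.data.type === 'SHARED_DATA') {
    renderAd(event.data.payload); // hypothetical render function
  }
});

// On the parent page: answer each iframe's request with the shared data.
window.addEventListener('message', (event) => {
  if (event.data && event.data.type === 'GET_SHARED_DATA') {
    event.source.postMessage(
      { type: 'SHARED_DATA', payload: { userSegment: 'sports' } },
      event.origin
    );
  }
});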
I have been trying my hand at Angular 2 applications as part of my learning and thought I'd share my initial experience.
What is attribution and attribution modelling? What is the difference between single-touch and multi-touch models? Which model should you use?
Functional programming is a programming paradigm—a style of building the structure and elements of computer programs—that treats computation as the evaluation of mathematical functions and avoids changing-state and mutable data. It is a declarative programming paradigm, which means programming is done with expressions or declarations instead of statements.
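A tiny illustrative sketch in JavaScript (the names are made up) contrasting the two styles: the functional version is an expression over its input and never mutates shared state.

// Imperative: builds the result by mutating an array step by step.
function totalsImperative(orders) {
  const totals = [];
  for (let i = 0; i < orders.length; i++) {
    totals.push(orders[i].price * orders[i].quantity);
  }
  return totals;
}

// Functional: a single expression, no mutation of shared state.
const totalsFunctional = (orders) =>
  orders.map(({ price, quantity }) => price * quantity);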
In the second half of 2016, we decided to migrate our multi-tenant app from bare-metal servers to Azure. While you can find numerous benchmarks for various cloud platforms, there are very few relatable drill-downs on the thought process behind such as-is migrations to the cloud. More importantly, this was not just a migration - it was literally a war with all hands on deck; keeping existing usage, client data and growth intact, we were able to migrate over 1.4TB of data and existing clients to the cloud successfully. I thought this story needed to be told, and so I did.
There. I’ve said it - I am addicted to the REPL.
Looking back, it has been one of the best learning tools for me - today I know it's called the REPL.
Asynchronous programming is the approach of creating code that can execute in parallel and does not have to wait for an action to complete before moving on to the next one. The ideology behind this is that actions which are independent of one another should not be executed in sequence, as that would be a waste of precious time. Keeping this programming paradigm in mind, Node.js was created.
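A minimal sketch of what that looks like in Node.js (file names are placeholders): two independent reads are started back to back, and each callback runs whenever its result is ready rather than blocking the other.

const fs = require('fs');

fs.readFile('report-a.csv', 'utf8', (err, dataA) => {
  if (err) throw err;
  console.log('report-a ready:', dataA.length, 'chars');
});

fs.readFile('report-b.csv', 'utf8', (err, dataB) => {
  if (err) throw err;
  console.log('report-b ready:', dataB.length, 'chars');
});

console.log('Both reads started; nothing blocked while they run.');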
In Digital Advertising attribution is the problem of assigning credit to one or more advertisements for driving the user to the desirable actions such as making a purchase. This post discusses how one can model this process and its impact on budget allocation.
When designing architecture for mission-critical systems, the two most commonly discussed aspects are scalability and availability. More often than not, both aspects are used interchangeably. Scalability is about being able to handle increasing load, while availability is about keeping the system operational by decreasing downtime. Designing highly available systems means focusing on the qualitative measures that reduce downtime and eliminate single points of failure (SPOFs). Here are some learnings and thoughts on things to consider while architecting an HA system.
Advancements by cloud-based IaaS providers (Amazon Web Services, Google Cloud and Azure) have made on-demand scale and flexibility a reality. Today, as a startup, you don't need to worry about over-provisioning resources, forecasting growth in infrastructure or signing long-term infrastructure contracts to meet your demands. Interestingly, a new suite of cloud services is challenging the core aspect of common application architectures - the 'server' - and is termed `serverless`.
Transcoding is the process of converting a media file from one format, resolution, quality and spec to another. In the past, a transcoding pipeline would require a lot of heavy lifting on the software and hardware front. Today, using the cloud, you can set up a transcoding pipeline in a matter of minutes.
Can we calculate how long it takes for someone to respond? _**TL;DR**_: In short, yes - we can use probability theory to quantify our response times.
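As a hedged illustration of the kind of calculation involved (not necessarily the exact model the post uses), if response times are assumed to follow an exponential distribution with rate \lambda, then:

P(\text{response within } t) = 1 - e^{-\lambda t}, \qquad \mathbb{E}[\text{response time}] = \frac{1}{\lambda}

For example, if someone replies on average within 30 minutes (\lambda = 2 per hour), the chance of a reply within 15 minutes is 1 - e^{-0.5} \approx 39\%.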
Every now and then, we realize we need to go back to the basics of building websites (read as web apps).
Yes, FEO is a real term - Front End Optimization.
TL;DR : We’ll take a quick look at how we moved to using CDNs to serve our static assets. Oh yeah - it is that simple.
In this post we discuss feedback control and how we can use this concept in practice to stabilize a system. We will take a look at a case study and see how we can use simple techniques to control the metric of interest.
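As a hedged sketch of what a simple technique can look like (not necessarily the exact controller from the case study), a proportional controller nudges a knob toward the target in proportion to the current error.

// Hypothetical proportional controller: `knob` is whatever we can turn
// (batch size, send rate, ...), assuming a larger knob pushes the metric up.
function controlStep(currentMetric, targetMetric, knob, gain = 0.1) {
  const error = targetMetric - currentMetric;
  return knob + gain * error;
}

// Example: the metric is above target, so the knob gets pulled down a little.
let knob = 10;
knob = controlStep(260, 200, knob); // error = -60, knob -> 10 + 0.1 * (-60) = 4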
If you believe in 'Manners maketh Man', then you would agree when I say 'Indexes maketh the Query in SQL'. In this post, I plan to discuss the most basic types of indexes available - clustered and nonclustered. I also plan to show with examples how they work, individually and together, and compare each case to the real world.
In an earlier blog post we discussed sponsored search marketing and the mechanics of auction design. In this post we shall look deeper into the challenges of online keyword advertising auctions among multiple bidders with limited budgets, and try to come up with a bidding strategy that will increase the expected utility of the advertiser.
Setting prices for a sealed auction by the search engine for different queries is pretty complicated. One possibility is simply to post prices, the way that products in a store are sold. But with so many possible keywords and combinations of keywords, each appealing to a relatively small number of potential advertisers, it would essentially be hopeless for the search engine to maintain reasonable prices for each query in the face of changing demand from advertisers. Instead, search engines determine prices using an auction procedure, in which they solicit bids from the advertisers. There are multiple slots for displaying ads, and some are more valuable than others. As part of this post we will discuss the various dynamics for Auction Mechanism Design for Sponsored Search.
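To make the mechanics concrete, here is a hedged worked example of the generalized second-price (GSP) rule that sponsored-search auctions are commonly described with (the bid numbers are made up): each slot winner pays, per click, the bid of the advertiser ranked just below them.

p_i = b_{i+1} \qquad \text{e.g. } b = (4, 2, 1),\ \text{two slots} \Rightarrow p_1 = 2,\ p_2 = 1 \text{ (per click)}

So the highest bidder wins the best slot but pays only what the runner-up bid, which is what makes the pricing self-adjusting as advertiser demand changes.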
Since Node.js applications are single-threaded, we can create multiple processes to listen on the same port and make use of all the CPU cores available on our machines. To simplify this, Node comes with a module called cluster. This module has a number of handy functions to create workers and monitor them. It also lets us send and receive messages from processes (similar to Window.postMessage() in the browser).
One of the limitations of this module is that a worker is not aware of any other worker and can only send messages to the master process. If one worker has to send a message to another worker, the message must first be sent to the master which then forwards it to the actual recipient.
This results in a lot of boilerplate code that’s repeated in a number of applications. Pseudo code to broadcast a message:
const cluster = require('cluster');

if (cluster.isMaster) {
  // In master: fork a worker per CPU core, then forward every
  // message received from any worker to all workers.
  require('os').cpus().forEach(() => cluster.fork());

  for (const id in cluster.workers) {
    cluster.workers[id].on('message', function (message) {
      for (const wid in cluster.workers) {
        cluster.workers[wid].send(message);
      }
    });
  }
} else {
  // In worker: send a message to the master...
  function sendMessage(message) {
    process.send(message);
  }

  // ...and handle messages the master forwards back.
  // function1/function2 stand in for the worker's own handlers.
  process.on('message', function (message) {
    switch (message.code) {
      case 'code1': function1(message); break;
      case 'code2': function2(message); break;
    }
  });
}
You'll have to write additional code to keep track of the sender's process id, send replies to just one worker, and other such stuff - all of which is pretty easy to do, but should really be a part of the library itself. So I went ahead and made a library that acts as a wrapper around the underlying cluster module and helps reduce boilerplate. :p The library provides functions called messageSiblings, messageWorkers etc. which do exactly what their names say.
The library can be found on npm under the MIT license. Feel free to fork it and add any functions you need. :)
As part of the revamp of the DCO engine, we have been adding support for quite a few highly customizable scenarios for dynamic creatives - including storyboarding, geo-location, weather etc. If you haven’t seen them yet then you should give them a dekho - story-board demo, e-commerce demo and geo-location demo.
Considering we allow highly customized dynamic creatives, one of the challenges we faced was the trade-off between making our algorithms generic enough vs. flexibility to accommodate advertiser use cases.
Let's take an example of an advertiser 'A1' who wants to show different creatives based on the current weather - sunny, cloudy, snow or rain - and another advertiser 'A2' who wants to show different creatives based on the current temperature - cold, pleasant or hot. Here is how these algorithms would look in pseudo-code.
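The pseudo-code itself is not reproduced in this excerpt; as a purely hypothetical sketch of what such per-advertiser rules tend to look like (creative names and temperature thresholds are made up):

// Hypothetical sketch only; not the post's actual pseudo-code.
// Advertiser A1: pick a creative by weather condition.
function creativeForA1(weather) {
  switch (weather.condition) {
    case 'sunny':  return 'a1-sunny-banner';
    case 'cloudy': return 'a1-cloudy-banner';
    case 'snow':   return 'a1-snow-banner';
    case 'rain':   return 'a1-rain-banner';
    default:       return 'a1-default-banner';
  }
}

// Advertiser A2: pick a creative by temperature band.
function creativeForA2(weather) {
  if (weather.temperatureC < 10) return 'a2-cold-banner';
  if (weather.temperatureC < 28) return 'a2-pleasant-banner';
  return 'a2-hot-banner';
}

Even in this toy form, the two advertisers need different inputs and different branching, which is exactly the generic-vs-flexible trade-off mentioned above.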
The unexpected side effects of converting a single-threaded service into a multi-threaded, multi-instance service.
We’re in the middle of one of our most critical migrations - moving to the cloud. One of the most frequently used terms about this shift is scale: the ability to run multiple instances of something, without worrying about the operational overheads.
During this migration, we are looking at ways of parallelizing pretty much every background service. One such service is our External Clicks worker. Since we were in a hurry and needed to migrate ~500GB of data to the new servers, we decided to run multiple instances of this worker.
All was well. Well, almost.
As someone whose only experience with asynchronous functions was jQuery's $.ajax, I was in for a treat when I started working on a project using Node.js. Functions that took functions as arguments? "That's okay," I told myself, "I've used functions like map and filter." Little did I know what a mess this could quickly turn into!
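A tiny hedged sketch (file names are placeholders) of how quickly those function-arguments nest once each step depends on the one before it:

const fs = require('fs');

// Each step needs the previous result, so the callbacks nest.
fs.readFile('config.json', 'utf8', function (err, raw) {
  if (err) return console.error(err);
  const config = JSON.parse(raw);
  fs.readFile(config.usersFile, 'utf8', function (err, users) {
    if (err) return console.error(err);
    fs.writeFile('report.txt', users.toUpperCase(), function (err) {
      if (err) return console.error(err);
      console.log('Done - and we are already three levels deep.');
    });
  });
});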
Finding the optimal distribution of budget among different ad groups so that one can optimize their business objective is a challenging and time-consuming task. One would have to constantly monitor changes and make decisions accordingly, or could leave it to an intelligent system that looks at multiple features and trends in the data points and then comes up with suggestions.
We have been using Node.js rather successfully on the DeltaX Tracking and Ad-serving side of things with it churning >10K requests/min without breaking a sweat. The inherent asynchronous event-driven nature of Node.js helps us keep the latencies low and the memory footprint small. 1
In recent times, Go (aka golang) has come close to challenging Node.js with regard to building light-weight, high-performance micro-services, and it also brings its own set of advantages to the table. 2
This day, four years ago is when this journey began. I must admit that it took longer than expected to start this blog but happy that we have finally taken the plunge.
It’s pretty easy to underestimate the power of small learnings while you are trying to make things work and get things rolling. Hoping this blog acts as a journal celebrating small learnings and big achievements.
So, here we are, starting with the customary hello world.