OCR with Deep Learning

by Krishna Vishal Vemula

There has been a good amount of discussion about OCR lately. Since existing open-source OCR solutions don’t provide the accuracy or ease of training of their commercial counterparts, we have started looking at deep learning based solutions and have found some interesting work.

Calamari − A High-Performance Tensorflow-based Deep Learning Package for Optical Character Recognition

The authors have provided code to replicate the results in the paper, and they aim to provide a more robust open-source OCR solution. It is designed to be easy to use from the command line and modular enough to be integrated into, and customized from, other Python scripts.

Calamari OCR – GitHub repo

Calamari Paper – Paper

Calamari in its present form expects an image of a single text line as input, so a page has to be segmented into individual lines before being fed into Calamari.
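To make the preprocessing step concrete, here is a minimal sketch of line segmentation using a classical horizontal projection profile. This is a stand-in for illustration only: it is not part of Calamari and not one of the deep learning methods. The function names, the threshold values and the page representation (a list of pixel rows with 0.0 for ink and 1.0 for paper) are all assumptions.

```python
def segment_lines(page, ink_threshold=0.5, min_height=2):
    """Return (top, bottom) row ranges, one per text line.

    page is a list of pixel rows; each pixel is a float where 0.0 is
    black ink and 1.0 is white paper (an assumed representation).
    bottom is exclusive, so page[top:bottom] is the line image.
    """
    # One flag per pixel row: does this row contain any ink at all?
    has_ink = [any(pixel < ink_threshold for pixel in row) for row in page]

    lines, start = [], None
    for y, flag in enumerate(has_ink):
        if flag and start is None:
            start = y                          # a run of inked rows begins
        elif not flag and start is not None:
            if y - start >= min_height:        # ignore specks thinner than min_height
                lines.append((start, y))
            start = None
    # Close a line that runs to the bottom edge of the page.
    if start is not None and len(has_ink) - start >= min_height:
        lines.append((start, len(has_ink)))
    return lines


def crop_lines(page, **kwargs):
    """Cut the page into one image (list of rows) per detected line."""
    return [page[top:bottom] for top, bottom in segment_lines(page, **kwargs)]
```

Each cropped image could then be handed to a line-based recognizer. A projection profile only works on clean, unskewed pages, which is one reason to look at learned segmentation instead.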

Text Line segmentation using Deep Learning

A couple of deep learning based solutions using fully convolutional neural networks have been proposed. The results reported in the papers are promising.

Paper 1

Code for Paper 1

Paper 2

Calamari combined with text line segmentation has the potential to be more robust than existing open-source OCR solutions.


by Srikumar Subramanian

Spatial is a high level language for accelerators like FPGAs. See the Hello Spatial tutorial for a view of the language.

UTXO to account conversion

by Vishwas Bhushan

We are researching how to convert the UTXO model to the account model and vice versa, and also how to implement smart contracts on a UTXO-based model.

We are taking two different approaches for it:

From a practical perspective:

  • We explored Stellar: we set up a network with a couple of nodes and worked out how SCP (the Stellar Consensus Protocol) works, mainly how to set up its configuration. This part is done.
  • Secondly, we wanted to understand how to write smart contracts on Stellar. Stellar has no coding mechanism for writing contracts; rather, a smart contract here means a combination of multiple transactions that lets multiple parties work together (essentially a multi-signature contract). We took an escrow contract as an example. Here is an implementation.
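The multi-signature idea behind such an escrow can be sketched as a toy model. This is not the implementation linked above and does not use any Stellar SDK: the class, field and participant names are hypothetical, and "signatures" are plain names rather than cryptographic signatures. It only illustrates two concepts Stellar does provide, weighted signers checked against a threshold and a time bound before which a transaction is not valid.

```python
from dataclasses import dataclass


@dataclass
class EscrowAccount:
    """Toy escrow: funds move only with enough signer weight, after unlock_time."""
    signers: dict       # signer name -> weight (hypothetical names)
    threshold: int      # total weight needed to authorise a payment
    unlock_time: int    # earliest time, in seconds, at which funds may move

    def authorises(self, signatures, now):
        # Too early: the time bound is not yet satisfied.
        if now < self.unlock_time:
            return False
        # Count each signer once; unknown signers contribute no weight.
        weight = sum(self.signers.get(name, 0) for name in set(signatures))
        return weight >= self.threshold


# Buyer and seller must both sign, and only after the lock period.
escrow = EscrowAccount(signers={"buyer": 1, "seller": 1}, threshold=2, unlock_time=1000)
both_signed_late = escrow.authorises(["buyer", "seller"], now=1500)
one_signed_late = escrow.authorises(["buyer"], now=1500)
both_signed_early = escrow.authorises(["buyer", "seller"], now=500)
```

With these settings neither party can release the funds alone, which is the property the escrow example relies on.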

From a theoretical perspective:

  • We wanted to explore how to translate UTXO to account and vice versa, and whether it is even possible. This work is still in progress, but here and here are short notes on it.
  • Since we have a Stellar network ready and know how to write smart contracts on it, we next want to try this translation at the smart contract level (we haven’t achieved this yet).
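For plain balances only, one naive reading of the translation can be sketched as follows. This is an assumption-laden illustration, not the approach from the notes linked above: it ignores scripts, signatures and contracts, and the tuple layout and the `genesis` transaction id are made up for the example.

```python
from collections import defaultdict


def utxos_to_accounts(utxos):
    """Fold unspent outputs into per-owner balances.

    utxos is an iterable of (tx_id, output_index, owner, amount) tuples
    describing unspent outputs only (an assumed layout).
    """
    balances = defaultdict(int)
    for _tx_id, _index, owner, amount in utxos:
        balances[owner] += amount
    return dict(balances)


def accounts_to_utxos(balances, genesis_tx_id="genesis"):
    """Create one fresh unspent output per funded account."""
    funded = sorted((owner, amount) for owner, amount in balances.items() if amount > 0)
    return [(genesis_tx_id, index, owner, amount)
            for index, (owner, amount) in enumerate(funded)]


utxos = [("t1", 0, "alice", 5), ("t1", 1, "bob", 3), ("t2", 0, "alice", 2)]
balances = utxos_to_accounts(utxos)          # {"alice": 7, "bob": 3}
rebuilt = accounts_to_utxos(balances)        # two outputs instead of three
```

The round trip preserves balances but not the original set of outputs, so even in this simple case the translation loses information in one direction.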

Blockchain development from scratch

by Vishwas Bhushan

We are looking into Stellar and, in parallel, keeping an eye on how to develop our own blockchain, either by forking one or by developing from scratch. While reading the Stellar white paper, I came across a lot of keywords related to our research work. One of them is Tendermint.

Tendermint Core is BFT (Byzantine fault tolerant) middleware that takes a state transition machine, written in any programming language, and securely replicates it on many machines. It is a low-level engine based on a BFT protocol, and it is being used as a development kit for building blockchains. Other documentation related to Tendermint is here. A paper related to BFT is here.

Some notes on Tendermint:

Tendermint consists of two chief technical components: a blockchain consensus engine and a generic application interface. The consensus engine, called Tendermint Core, ensures that the same transactions are recorded on every machine in the same order. The application interface, called the Application BlockChain Interface (ABCI), enables the transactions to be processed in any programming language.
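The split between consensus engine and application can be illustrated with a toy deterministic state machine. This is only a sketch: it does not talk to Tendermint or implement the real ABCI protocol. The method names loosely mirror the ABCI messages CheckTx, DeliverTx and Commit, while the `key=value` transaction format, the class name and the `replicate` helper are assumptions for the example.

```python
import hashlib
import json


class KVStoreApp:
    """Toy deterministic application; transactions look like 'key=value'."""

    def __init__(self):
        self.state = {}

    def check_tx(self, tx):
        # Validate without changing state: need a non-empty key and an '='.
        key, sep, _value = tx.partition("=")
        return sep == "=" and key != ""

    def deliver_tx(self, tx):
        # Apply a transaction that consensus has already ordered.
        if not self.check_tx(tx):
            return False
        key, _, value = tx.partition("=")
        self.state[key] = value
        return True

    def commit(self):
        # Hash a canonical encoding of the state so replicas can compare.
        encoded = json.dumps(self.state, sort_keys=True).encode("utf-8")
        return hashlib.sha256(encoded).hexdigest()


def replicate(ordered_txs, replica_count=3):
    """Feed the same ordered transactions to several independent replicas."""
    hashes = []
    for _ in range(replica_count):
        app = KVStoreApp()
        for tx in ordered_txs:
            app.deliver_tx(tx)
        hashes.append(app.commit())
    return hashes
```

Because the application is deterministic, every replica that sees the same transactions in the same order ends up with the same state hash. Agreeing on that order is the job of the consensus engine, which this sketch leaves out.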

What blockchains are built on Tendermint so far?

A lot of them, for example Hyperledger Fabric, Hyperledger Burrow, Cosmos, Ethermint, etc.

Article: Mobile designers will shape the future of 3D application design

by Srikumar Subramanian

“The first great mobile AR experiences will be built by people who already know 2D design tools and workflows.”


There are five major challenges that make up the Wall of Pain.

  1. Learning new tools and developing workarounds.
  2. Cumbersome communication between designers, developers, and other stakeholders.
  3. Obtaining and creating 3D assets.
  4. Rapid prototyping that’s anything but rapid.
  5. The challenges of sharing work.


by Srikumar Subramanian

Generating random numbers in a deterministic virtual machine like Ethereum is a hard problem. RANDAO is a “distributed autonomous organization” that serves up random numbers provided by its contract users. The gist is that it invites users to pledge an amount and submit a hash of a secret random number, and then, after a period, to reveal the secret number. The random number itself is generated from the random numbers collected during a round. See the GitHub repo of RANDAO for details.
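The commit-then-reveal flow described above can be sketched in a few lines. This is a simplified model under stated assumptions, not RANDAO's contract code: SHA-256 stands in for the hash function, the revealed secrets are combined with XOR, pledges are not modelled, and a missing or mismatched reveal simply aborts the round. All function and participant names are hypothetical.

```python
import hashlib


def commit(secret: bytes) -> str:
    """Phase 1: publish only the hash of the secret (SHA-256 is an assumption)."""
    return hashlib.sha256(secret).hexdigest()


def run_round(commitments, reveals):
    """Phase 2: check every reveal against its commitment, then combine.

    commitments maps participant -> hex digest published in phase 1.
    reveals maps participant -> secret bytes disclosed in phase 2.
    Raises ValueError if a committed participant fails to reveal correctly.
    """
    result = 0
    for participant, commitment in commitments.items():
        secret = reveals.get(participant)
        if secret is None or commit(secret) != commitment:
            raise ValueError(f"{participant} did not reveal the committed secret")
        # XOR is order-independent, so iteration order does not matter.
        result ^= int.from_bytes(secret, "big")
    return result


commitments = {"alice": commit(b"\x01"), "bob": commit(b"\x02")}
random_number = run_round(commitments, {"alice": b"\x01", "bob": b"\x02"})  # 1 XOR 2 = 3
```

Publishing the hash first prevents a participant from choosing a secret after seeing the others. Real secrets would be long random byte strings rather than the one-byte values used here.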

A survey of text clustering algorithms

by Srikumar Subramanian

A survey of text clustering algorithms by Aggarwal and Zhai (link to PDF, original ResearchGate source).

Covers TF/IDF, Latent Semantic Indexing, non-negative matrix factorization, distance based and hierarchical clustering, hybrid scatter-gather method, word and phrase based clustering, co-clustering words and documents, graph clustering, information theoretic approaches, LDA (latent Dirichlet allocation) based topic modeling, online (streaming) clustering and semi-supervised clustering.

Universal Dependencies

by Srikumar Subramanian

Universal Dependencies is a project that is developing cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective.

Avoiding nausea in Virtual Reality

by Srikumar Subramanian

VR tech’s Achilles’ heel has been the tendency to leave users giddy or nauseous at the end of an experience. If you want folks to have a pleasant experience, you need to stick to the following design principle:

NEVER perform any camera movement in the VR space that isn’t also a movement done by the user.

In practice, this means a) you can do head tracking turns, b) you can do manipulative actions on the scene objects, c) you cannot set the camera in linear or other motion, especially accelerating motion, even if it is in response to a controller event. Make sure what your vestibular system knows and what your eyes see always agree with each other.

This is a severe constraint, but if you absolutely need the maximum reach, there is no way around it that we know of so far … unless you have gravity warping technology.