Sahajraj Malla
Inventor of MallaNet, Student at Kathmandu University, Data & Research Scientist
“The idea for MallaNet came from a very personal space. The name itself - derived from my surname ‘Malla’ and ‘Net’ for neural network - felt like putting my own stamp on it,” says Sahajraj Malla, the inventor of MallaNet.
Malla developed MallaNet, an AI model capable of recognising handwritten Devanagari script, to assist people who struggle with handwritten documents, old records and everyday forms. As someone with mild dyslexic traits, he found it difficult to read the visually rich but complex Devanagari script. His goal was to build a tool that could convert messy handwriting into clear digital text, making information more accessible, not just for himself, but for many others facing similar challenges.
“I built MallaNet as a convolutional neural network with branching to capture both large- and small-scale features, residual connections to enable deeper learning without losing information, and specialised capsule layers to preserve spatial relationships between character components,” Malla explains. “Devanagari has complex curves, joint letters and marks placed above or below characters. This architecture helps the model understand those intricate relationships better than standard designs.”
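To make the branching and residual ideas concrete, here is a minimal PyTorch sketch. It is not MallaNet's actual code: the class names, kernel sizes, channel count, 32x32 input size and the placeholder figure of 10 output classes are all assumptions chosen for illustration, and the capsule layers Malla mentions are left out entirely.

```python
import torch
from torch import nn


class MultiScaleResidualBlock(nn.Module):
    """Hypothetical block: parallel 3x3 and 5x5 branches plus a skip connection."""

    def __init__(self, channels: int):
        super().__init__()
        # Small-scale branch: 3x3 kernels respond to fine strokes and dots.
        self.fine = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        # Large-scale branch: 5x5 kernels cover more of the overall shape.
        self.coarse = nn.Conv2d(channels, channels, kernel_size=5, padding=2)
        # A 1x1 convolution merges the concatenated branches back to `channels`.
        self.merge = nn.Conv2d(2 * channels, channels, kernel_size=1)
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        branches = torch.cat(
            [self.act(self.fine(x)), self.act(self.coarse(x))], dim=1
        )
        # Residual connection: add the input back so earlier detail is kept.
        return self.act(self.merge(branches) + x)


class TinyCharNet(nn.Module):
    """Hypothetical classifier built around one multi-scale residual block."""

    def __init__(self, num_classes: int = 10, channels: int = 16):
        super().__init__()
        self.stem = nn.Conv2d(1, channels, kernel_size=3, padding=1)
        self.block = MultiScaleResidualBlock(channels)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.head = nn.Linear(channels, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.block(torch.relu(self.stem(x)))
        return self.head(self.pool(x).flatten(1))


model = TinyCharNet()
# A batch of four blank greyscale images, 32x32 pixels (size is an assumption).
logits = model(torch.zeros(4, 1, 32, 32))
print(logits.shape)  # torch.Size([4, 10])
```

The padding on each branch keeps the feature maps the same size, which is what allows the two scales to be concatenated and the input to be added back.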
According to Malla, training MallaNet took several hours across multiple runs on Google Colab with a T4 GPU. He used PyTorch as the primary framework because of its flexibility and suitability for rapid experimentation. Key preprocessing steps included normalising images to a consistent size and brightness, centring each character, and applying augmentations such as slight rotations and shifts. These steps made the model more robust to real-world handwriting variations.
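The preprocessing steps can be illustrated with a small, dependency-free Python sketch. The function names, the list-of-lists image format, the ink threshold and the one-pixel shift limit are assumptions made for this example, not details of Malla's pipeline. Resizing and rotation are omitted to keep it short.

```python
import random
from typing import List, Optional

Image = List[List[float]]  # rows of grey values; higher means more ink (assumption)


def normalise(img: Image) -> Image:
    """Rescale pixel values linearly to the range [0, 1]."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    span = (hi - lo) or 1.0  # avoid dividing by zero on a uniform image
    return [[(p - lo) / span for p in row] for row in img]


def shift(img: Image, dy: int, dx: int) -> Image:
    """Move the image by (dy, dx) pixels, filling exposed cells with 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = img[y][x]
    return out


def centre(img: Image, threshold: float = 0.5) -> Image:
    """Shift the bounding box of the ink to the middle of the canvas."""
    ys = [y for y, row in enumerate(img) for p in row if p > threshold]
    xs = [x for row in img for x, p in enumerate(row) if p > threshold]
    if not ys:
        return img  # blank image: nothing to centre
    h, w = len(img), len(img[0])
    dy = (h - 1 - (min(ys) + max(ys))) // 2
    dx = (w - 1 - (min(xs) + max(xs))) // 2
    return shift(img, dy, dx)


def augment(img: Image, max_shift: int = 1,
            rng: Optional[random.Random] = None) -> Image:
    """Apply a small random shift of at most `max_shift` pixels per axis."""
    rng = rng or random.Random()
    return shift(img,
                 rng.randint(-max_shift, max_shift),
                 rng.randint(-max_shift, max_shift))


# Toy 5x5 image with a single ink pixel in the top-left corner.
sample = [[0.0] * 5 for _ in range(5)]
sample[0][0] = 200.0
centred = centre(normalise(sample))
print(centred[2][2])  # 1.0
```

In a real training loop the augmentation would be applied afresh to each sample in every epoch, so the model rarely sees exactly the same image twice.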
“To debug errors, I analysed confusion matrices to identify common misclassifications, visualised activation maps to see what the model ‘noticed’ in failing cases, and ran targeted tests on problematic characters,” he says. Based on these insights, he fine-tuned the network architecture and adjusted data augmentations accordingly.
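As a rough illustration of the confusion-matrix step, the Python sketch below counts (true, predicted) label pairs and lists the most frequent mistakes. The helper names and the six sample labels are invented for this example and are not MallaNet results.

```python
from collections import Counter


def confusion_counts(true_labels, predicted_labels):
    """Count how often each (true, predicted) pair occurs."""
    return Counter(zip(true_labels, predicted_labels))


def top_confusions(counts, k=3):
    """Return the k most frequent misclassified (true, predicted) pairs."""
    errors = {pair: n for pair, n in counts.items() if pair[0] != pair[1]}
    # Sort by count (descending), then by label pair for a stable order.
    return sorted(errors.items(), key=lambda item: (-item[1], item[0]))[:k]


# Invented labels for illustration only.
true_labels = ["ब", "ब", "व", "व", "व", "क"]
predicted = ["ब", "व", "व", "ब", "ब", "क"]

for (true, pred), n in top_confusions(confusion_counts(true_labels, predicted)):
    print(f"{true} read as {pred}: {n} time(s)")
```

A list like this points to the character pairs that deserve targeted tests or extra training samples.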
One of the biggest challenges in building MallaNet was the sheer variability in handwriting. Individuals write the same character in dramatically different ways, and some characters differ only by minor strokes or dots.
Malla also aimed to keep the model computationally efficient. He addressed these challenges through extensive experimentation, refining multi-scale feature extraction and training on thousands of real handwritten samples.
“There were plateaus where the model consistently confused very similar characters, which slowed progress,” Malla notes.
Characters that differ only by small curves, dots or stroke placement - especially consonants with similar shapes and varying matras (vowel signs) - proved particularly difficult. Refining how features from different scales were merged, and carefully weighting those features, significantly improved the model’s ability to distinguish these challenging cases.
Another key improvement came from replacing complex capsule filters with a more homogeneous capsule filter. According to Malla, this preserved fine stroke details without blurring, which is essential for distinguishing visually similar characters. The change also helped achieve high accuracy with fewer computational resources.
MallaNet has wide-ranging potential applications. It could help digitise old Nepali manuscripts, automate government form processing, speed up archival work, or enable instant reading of vehicle number plates. It could also support automated grading of handwritten exams and power educational applications. For people with visual impairments, MallaNet could feed text-to-speech tools, while for those with reading difficulties, it could improve accessibility and inclusion.
Malla also believes that with fine-tuning, the model could be adapted for other scripts such as Bengali, Tibetan or Thai. “The core ideas for handling complex strokes and spatial relationships transfer well across these scripts. It could play a role in preserving and digitising regional languages,” he says.
Looking ahead, Malla plans to further enhance MallaNet by adding attention mechanisms to better handle joint letters, optimising the model for mobile devices and expanding it to recognise full handwritten sentences rather than individual characters.
“If I were to rebuild MallaNet today, I would integrate modern transformer architectures for better contextual understanding, train on much larger and more diverse datasets, and optimise it for real-time performance on smartphones,” he concludes.
MallaNet was created to make handwritten Devanagari script easier to read and understand. Sahajraj Malla’s personal difficulty reading Devanagari, and seeing others face the same challenge, motivated him to build an AI system that makes handwritten Devanagari more accessible and inclusive.
