Hi, I'm Ben Koska

I am an AI researcher passionate about advancing deep learning.

About me

What are your main research interests?

I focus on AI kernel optimization, large-scale distributed training, and small (language) models for on-edge computing.

What do you enjoy beyond research?

I love diving into aerospace engineering concepts, tinkering with simulations, and brainstorming how AI can redefine the future of flight and space exploration.

Career

SF Tensor

San Francisco, USA

Co-Founder, 2025 - Current

I co-founded SF Tensor, a startup working to build the future of high-performance computing.

High Performance Computing

KOSKA GmbH

Vienna, Austria

Head of Software Engineering, 2022 - 2025

I headed the software engineering department at KOSKA, as well as several artificial intelligence research projects.

Management, Software Engineering, Big Data

Stealth Compiler Startup

London, England

Co-Founder & CEO, 2023 - 2024

I co-founded a stealth tech startup and served as its CEO, overseeing the development of the product.

LLVM, Compiler, C++

Stealth FinTech Startup

London, England

Co-Founder & CTO, 2020 - 2021

I co-founded a stealth FinTech startup and served as its CTO, overseeing the development of the product and the hiring of software engineers.

FinTech, Management, Swift, Kotlin, TypeScript

Scribbly

Boston, Massachusetts, United States

Software Engineer, 2018 - 2020

I worked as a software engineer, developing a machine-learning-based news analysis system for a security platform, an incident reporting tool, and a web app for electronic business cards.

News Analysis, Artificial Intelligence, Swift, Incident Reporting, Security

Publications

Towards Multi-Modal Mastery: A 4.5B Parameter Truly Multi-Modal Small Language Model

2nd International Conference On Foundation And Large Language Models

Author, Nov 2024

We present a novel 4.5B parameter small language model that can handle multiple input and output modalities, including text, images, videos, and audio. Despite its small size, the model achieves near state-of-the-art performance on a variety of tasks, demonstrating the potential of multi-modal models to tackle complex real-world problems. Our approach leverages recent advancements in language modeling and multi-task learning to create a versatile and high-performing model that can even be deployed for edge inference. Experimental results show the model's strong performance across multiple benchmarks, paving the way for further progress in multi-modal artificial intelligence.

Artificial Intelligence, Multi-Modal, Small Language Model

Simulated Realities are all you need: Synthetic Data for Enhanced Depth, Localization, and Object Detection in egocentric views

Under review

Author, Nov 2024

Augmented reality (AR) applications require precise spatial understanding to facilitate accurate mapping, object detection, and scene comprehension. However, real-world data for training such tasks is often difficult to obtain at scale, and especially difficult to label, and it lacks the diversity needed for robust model performance. This paper introduces a synthetic data framework to advance AR capabilities, leveraging simulated environments with realistic lighting, textures, and complex scene interactions.

Synthetic Data, Depth Estimation, Localization, Object Detection

Volunteering

Austrian Excellence Society

President, 2024 - Current

We are an association that supports Austrian students in completing outstanding projects and securing a spot in future economies.

Education

Get in Touch

Collaborate

colab@benkoska.com

Say hello

hello@benkoska.com