MedGemma Medical Classifier
February 25, 2026
Kaggle MedGemma Impact Challenge entry. Fine-tuned MedGemma 4B with QLoRA (runs in under 5 GB of memory). React Native Android app for scan upload, structured reports, and urgency classification.
What is it?
This was my entry for the MedGemma challenge. I fine-tuned Google’s MedGemma 4B model and then wrapped it in a mobile app so the result felt like an actual product instead of just a notebook demo. The app lets users upload a scan and receive a structured response with findings, impression, and urgency classification.
What I cared about here was not just whether the model could answer. I wanted the output shape to be useful enough that a frontend could rely on it cleanly.
How it works
The full flow is straightforward. The mobile app selects an image, sends it to a FastAPI backend, and the fine-tuned model returns a structured response. Instead of dumping plain text, I ask the model for fields that the UI can render directly, like findings, an overall impression, and an urgency level.
That matters because medical-style interfaces need structure. A frontend should not have to guess where the diagnosis starts or whether a sentence means urgent or not urgent. I wanted the backend to return something the app could render with very little interpretation.
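To make the idea of a renderable response concrete, here is a minimal sketch of what such a contract could look like on the backend. The three field names come from the description above; the class names and the specific urgency labels are my own placeholders for illustration, not the values the project actually uses.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum


class Urgency(str, Enum):
    # Hypothetical labels; the real set of urgency levels is not listed in this post.
    ROUTINE = "routine"
    URGENT = "urgent"
    EMERGENCY = "emergency"


@dataclass
class ScanReport:
    findings: list[str]   # one entry per finding, rendered as a list in the UI
    impression: str       # overall summary, rendered as its own section
    urgency: Urgency      # closed set of values, rendered as a badge

    def to_json(self) -> str:
        """Serialize to the JSON body the mobile client would receive."""
        payload = asdict(self)
        payload["urgency"] = self.urgency.value  # emit the plain string label
        return json.dumps(payload)


report = ScanReport(
    findings=["Example finding text"],
    impression="Example impression text",
    urgency=Urgency.ROUTINE,
)
print(report.to_json())
```

Because urgency is a closed set rather than free text, the client can switch on it directly instead of interpreting a sentence.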
Under the hood: MedGemma
MedGemma is useful here because it already starts with domain knowledge. It is not a general model that I am trying to force into sounding medical. It has medical pretraining, so the fine-tuning step is more about adapting its behavior and output format than teaching medicine from zero.
I used QLoRA again because it keeps the training practical. The base model stays frozen and quantized to 4-bit precision, and I only train small low-rank adapter layers on top. That makes it possible to experiment on much smaller hardware while still getting a meaningful domain-specific result.
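A rough back-of-the-envelope calculation shows why this is practical. The sketch below only estimates weight storage and adapter size. It ignores quantization constants, activations, optimizer state, and anything else that takes memory at run time, so treat the numbers as a lower bound rather than a measurement. The 2560 x 2560 projection and rank 16 are illustrative values I picked, not settings taken from this project.

```python
def quantized_weight_gb(num_params: float, bits_per_param: float = 4.0) -> float:
    """Approximate storage for the frozen base weights, in GB (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    LoRA learns two small matrices, A (rank x d_in) and B (d_out x rank),
    instead of updating the full d_out x d_in matrix.
    """
    return rank * (d_in + d_out)


# A 4-billion-parameter model: 16-bit weights versus 4-bit weights.
print(quantized_weight_gb(4e9, bits_per_param=16))  # 8.0
print(quantized_weight_gb(4e9, bits_per_param=4))   # 2.0

# Illustrative layer: adapter parameters versus the full matrix.
print(lora_params(2560, 2560, rank=16))  # 81920
print(2560 * 2560)                       # 6553600
```

In this example the adapter is about 1/80 the size of the matrix it adapts, which is where most of the training-time savings come from.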
React Native with Expo
I built the client in React Native with Expo because I wanted the app layer to be fast to ship. The app can pick an image from the user’s device, convert it into something the API can accept, send the request, and render the model output as a structured report.
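One common way to "convert it into something the API can accept" is to base64-encode the image bytes into a JSON body. I am sketching that here in Python purely to show the shape of the round trip. The real client is React Native, and the field names `image_base64` and `mime_type` are hypothetical. A multipart upload would work just as well, and this post does not depend on either choice.

```python
import base64
import json


def build_scan_request(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Client side: wrap raw image bytes in a JSON body (hypothetical field names)."""
    body = {
        "image_base64": base64.b64encode(image_bytes).decode("ascii"),
        "mime_type": mime_type,
    }
    return json.dumps(body)


def read_scan_request(raw_body: str) -> tuple[bytes, str]:
    """Server side: recover the original bytes, rejecting malformed base64."""
    body = json.loads(raw_body)
    image_bytes = base64.b64decode(body["image_base64"], validate=True)
    return image_bytes, body["mime_type"]


# Round trip with a few placeholder bytes standing in for a real image file.
raw = build_scan_request(b"\xff\xd8\xff\xe0", "image/jpeg")
image_bytes, mime_type = read_scan_request(raw)
print(len(image_bytes), mime_type)
```

Base64 inflates the payload by roughly a third, so for large scans a multipart upload may be the better trade-off.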
Expo was a good fit here because I did not need a lot of custom native code to get moving. It gave me a practical path to build, test, and package the app without turning the mobile layer into a separate infrastructure problem.
Structured output from medical LLMs
One of the biggest lessons here was that free-form text is not enough if the result is supposed to plug into an interface. If the model responds differently every time, the UI becomes fragile. So I pushed the model toward structured JSON output and designed the interface around fields instead of paragraphs.
That made the whole system feel much more solid. The frontend can render urgency as a badge, findings as a list, and the summary as a dedicated section. Once the output became structured, the difference between a rough prototype and a usable application became very obvious.
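Even when a model is pushed toward JSON, it can still wrap the object in extra prose or vary the casing of a label, so it helps to validate before anything reaches the UI. The following is a minimal sketch of that step, assuming the three fields described above. The function name and the allowed urgency labels are hypothetical.

```python
import json

# Hypothetical labels; the real set of urgency levels is not listed in this post.
ALLOWED_URGENCY = {"routine", "urgent", "emergency"}


def parse_report(raw_text: str) -> dict:
    """Pull the outermost {...} span out of raw model text and validate it."""
    start = raw_text.find("{")
    end = raw_text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")

    # json.JSONDecodeError is a ValueError, so callers catch one exception type.
    data = json.loads(raw_text[start : end + 1])

    findings = data.get("findings")
    if not isinstance(findings, list) or not all(isinstance(f, str) for f in findings):
        raise ValueError("findings must be a list of strings")

    impression = data.get("impression")
    if not isinstance(impression, str) or not impression.strip():
        raise ValueError("impression must be a non-empty string")

    urgency = str(data.get("urgency", "")).strip().lower()
    if urgency not in ALLOWED_URGENCY:
        raise ValueError(f"unexpected urgency: {urgency!r}")

    return {"findings": findings, "impression": impression.strip(), "urgency": urgency}


noisy = 'Sure, here is the report: {"findings": ["x"], "impression": " ok ", "urgency": "URGENT"}'
print(parse_report(noisy))
```

Failing loudly on a bad response lets the backend retry or return an explicit error, so the app never has to render a half-valid report.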
Key takeaways
- MedGemma: medical foundation model capabilities, fine-tuning on domain-specific imaging data
- QLoRA for multimodal models: image-text pair dataset formatting, adapter training
- Expo EAS Build: cloud APK compilation without Android Studio
- Structured LLM output: JSON mode, output schemas, parsing for downstream rendering
- React Native: expo-image-picker and expo-file-system for camera roll to API workflows