[Recap] on Audio. Creating Speech rAIter micro SaaS

September 12, 2025

TL;DR

Doing TTS/S2T with streamlit and the st speechraiter was fun.

Now it's time to build something similar with Flask.

You can get started with this kind of project like so.

Intro

Recently I heard about myminutes.ai, a tool to summarize meetings.

What can we build around audio/speech?

Can you imagine practicing job interviews in front of an AI?

Some kind of way to LandThatJob

Or practicing that very important presentation with a SpeechPractice service.

The Speech Rater Stack

Previously I made a PoC streamlit version:

Choosing from Streamlit to Flask vs FastAPI 📌
I went with FastAPI, as suggested by Gemini and this doc.

So I moved forward with Cursor and FastAPI:

Fast API

And quickly made that simple UI, connected to OpenAI TTS/S2T capabilities.

With such architecture, see mermaid

As you can see in the video, where I show how it works while doing BiP (building in public), this already helps me create quick YouTube videos.
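A minimal sketch of the S2T and TTS round trip behind that UI, assuming the official `openai` Python package (v1 client). The model names `whisper-1` and `tts-1`, the helper names and the file names are assumptions for illustration, not taken from the project; `onyx` is the voice mentioned in the conclusions. The client is passed in as an argument, so the helpers can be tried with a stub and no API key.

```python
from pathlib import Path


def transcribe(client, audio_path, model="whisper-1"):
    """Speech-to-text: upload an audio file and return the transcript text."""
    with open(audio_path, "rb") as audio_file:
        result = client.audio.transcriptions.create(model=model, file=audio_file)
    return result.text


def synthesize(client, text, out_path, model="tts-1", voice="onyx"):
    """Text-to-speech: request audio for `text` and write the bytes to out_path."""
    response = client.audio.speech.create(model=model, voice=voice, input=text)
    out = Path(out_path)
    out.write_bytes(response.read())
    return out


# Usage (needs `pip install openai` and OPENAI_API_KEY in the environment):
#   from openai import OpenAI
#   client = OpenAI()
#   text = transcribe(client, "recording.wav")   # hypothetical file name
#   synthesize(client, text, "voiceover.mp3")    # hypothetical file name
```

In a FastAPI app these two helpers would typically sit behind an upload endpoint and a download endpoint; that wiring is left out here to keep the sketch dependency-free.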

⚠️
Audio recording does not work on phones. Probably due to permissions.

Hey, what about the speech rater stuff?

Simple md Editor

Later on, I added simple markdown editing capabilities (there were a few candidates):

  1. Monaco Editor (VS Code Editor) - RECOMMENDED ⭐ (this is the one Cursor went for, enough for a quick edit)
     Pros: Full VS Code experience, syntax highlighting, IntelliSense, built-in markdown preview
     Cons: Larger bundle size (~2MB)
     Best for: Professional editing experience

  2. CodeMirror 6 - LIGHTWEIGHT ⭐
     Pros: Lightweight, fast, good markdown support, customizable
     Cons: Fewer features than Monaco
     Best for: Balanced performance and features

  3. SimpleMDE (Markdown Editor) - SIMPLE
     Pros: Very lightweight, live preview, easy to use
     Cons: Fewer advanced features
     Best for: Simple editing needs

  4. Toast UI Editor - MODERN
     Pros: WYSIWYG + markdown, good mobile support
     Cons: Medium bundle size
     Best for: User-friendly editing

ℹ️
A WYSIWYG markdown editor post is coming soon

Thanks to the integrated Monaco editor, we can quickly tweak the transcript before saving it as a .md file.
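The "save the .md" step on the backend could look roughly like this. It is only a sketch: the `transcripts` folder, the date-plus-slug file naming and both function names are hypothetical, not how py-speech-rater necessarily does it.

```python
import re
from datetime import date
from pathlib import Path


def slugify(title):
    """Lowercase the title and collapse runs of non-alphanumerics into '-'."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "transcript"


def save_transcript_md(title, body, out_dir="transcripts"):
    """Write the edited transcript to <out_dir>/<date>-<slug>.md and return the path."""
    folder = Path(out_dir)
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{date.today().isoformat()}-{slugify(title)}.md"
    path.write_text(f"# {title}\n\n{body.strip()}\n", encoding="utf-8")
    return path
```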

The FastAPI Speech Rater

I finally wanted to combine FastAPI (BE) x SQLite for simple user management x a cool Astro SSG theme

because…

How could I not try an Astro theme…

MIT | Idol is an elegant landing page template for micro SaaS products built with AstroJS & Skeleton CSS

git clone https://github.com/LaB-CH3/Astro-idol
# cd Astro-idol && npm install
# npm run dev -- --host 0.0.0.0 --port 4321  # then browse to http://192.168.1.11:4321/

After asking Cursor to connect the Astro theme with FastAPI and make login possible via SQLite…

FastAPI x signup integrated with astro

This happened:

# Start both servers
make dev-full

#make docker-dev-build
make docker-dev-up  # Start both servers in containers FAST and ASTRO working together!!!

#cd /home/jalcocert/Desktop/py-speech-rater/fastapi-speech-rater && sqlite3 ./users.db ".schema users"
# Check all users
sqlite3 ./users.db "SELECT id, email, first_name, last_name, created_at FROM users;"

# Check specific user by email
sqlite3 ./users.db "SELECT * FROM users WHERE email = 'test@example.com';"

# Count total users
sqlite3 ./users.db "SELECT COUNT(*) FROM users;"

#sqlite3 ./users.db
#.tables
#SELECT id, email, first_name, last_name, created_at FROM users;
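The same checks can be done from Python with the standard `sqlite3` module. This is a sketch under assumptions: the schema below only contains the columns visible in the queries above (the real `users` table presumably also stores a password hash), and the helper names are made up for illustration. It uses an in-memory database so it does not touch `users.db`.

```python
import sqlite3


def init_db(conn):
    """Create a users table with the columns seen in the CLI queries (assumed schema)."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS users (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            email TEXT NOT NULL UNIQUE,
            first_name TEXT,
            last_name TEXT,
            created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        )
        """
    )


def add_user(conn, email, first_name, last_name):
    """Insert a user with a parameterized query and return the new row id."""
    cur = conn.execute(
        "INSERT INTO users (email, first_name, last_name) VALUES (?, ?, ?)",
        (email, first_name, last_name),
    )
    conn.commit()
    return cur.lastrowid


def find_user(conn, email):
    """Look a user up by email; returns a sqlite3.Row, or None if there is no match."""
    return conn.execute("SELECT * FROM users WHERE email = ?", (email,)).fetchone()


def count_users(conn):
    """Return the total number of users."""
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]


conn = sqlite3.connect(":memory:")  # swap for "./users.db" to inspect the real file
conn.row_factory = sqlite3.Row      # lets us read columns by name
init_db(conn)
add_user(conn, "test@example.com", "Test", "User")
```

The `?` placeholders matter once the email comes from a signup form: they keep user input out of the SQL string.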

The setup even works in containers, thanks to this compose file

Fast API x Astro Connected

⚠️
This is a quick sample setup with a lot of auth to-dos left, like the HTTP cookie setup
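One plausible reading of the cookie to-do is sending the session token as an `HttpOnly` cookie, so that page JavaScript cannot read it. A minimal standard-library sketch follows; the cookie name `session`, the one-hour lifetime and the function names are assumptions, not taken from the project.

```python
import secrets
from http.cookies import SimpleCookie


def new_session_token():
    """Generate an unguessable, URL-safe session token."""
    return secrets.token_urlsafe(32)


def session_cookie_header(token, max_age=3600):
    """Build the value of a Set-Cookie header for a hardened session cookie."""
    cookie = SimpleCookie()
    cookie["session"] = token
    cookie["session"]["httponly"] = True   # hidden from document.cookie
    cookie["session"]["secure"] = True     # only sent over HTTPS
    cookie["session"]["samesite"] = "Lax"  # limits cross-site sending
    cookie["session"]["path"] = "/"
    cookie["session"]["max-age"] = max_age
    return cookie["session"].OutputString()


header_value = session_cookie_header(new_session_token())
```

FastAPI responses expose a `set_cookie` method with equivalent `httponly`, `secure` and `samesite` options, so in the real app the same flags would be set there rather than by hand.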

Conclusions

This simple FastAPI recorder and transcript web app already helps me.

Now I can try to make those YouTube tech videos I wanted to do this year.

I just record the screen with OBS, cut quickly with Kdenlive and record my audio with Audacity.

Then the audio gets uploaded into this new py-speech-rater and we get the voice via Onyx, thanks to OpenAI S2T & TTS :)

ℹ️
So now my YouTube pipeline is: OBS -> Audacity -> FastAPI with OpenAI -> Kdenlive -> YT

What can be next from here?

Considering that FastAPI and Astro can talk to each other perfectly…

Making admin panels / dashboards / data apps (displaying via Chart.js) with this stack does not seem that complicated anymore…

FastAPI x Astro x ChartJS

See the ./fastapi-speech-rater folder, which contains those, and the related tech doc with the system's architecture
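To make the dashboard idea concrete, here is a sketch of the kind of payload a backend endpoint could return for Chart.js, which expects `labels` plus a list of `datasets`. The "signups per day" metric, the function name and the demo rows are invented for illustration; only the `users` table and its `created_at` column come from the queries shown earlier.

```python
import sqlite3


def signups_per_day(conn):
    """Aggregate users by signup date into a Chart.js-style data object."""
    rows = conn.execute(
        "SELECT date(created_at) AS day, COUNT(*) "
        "FROM users GROUP BY day ORDER BY day"
    ).fetchall()
    return {
        "labels": [day for day, _ in rows],
        "datasets": [{"label": "Signups", "data": [count for _, count in rows]}],
    }


# Demo with an in-memory database and made-up rows
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO users (email, created_at) VALUES (?, ?)",
    [
        ("a@example.com", "2025-09-10 08:00:00"),
        ("b@example.com", "2025-09-10 17:30:00"),
        ("c@example.com", "2025-09-11 09:15:00"),
    ],
)
payload = signups_per_day(conn)
```

On the Astro side, that dictionary (serialized as JSON) can be passed as the `data` option of a Chart.js chart.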

I don't see any reason not to ship micro SaaS products faster, like:

  1. Preparing Interviews with AI

I saw something interesting at interviewsby.ai, where you upload your resume for feedback

  2. Preparing/Rating Speech with AI…

All these would need is one of those MIT Astro micro SaaS themes + proper email validation (Logto JS + CSR?) + whatever backend logic via FastAPI/pb/any other


FAQ

How to get started and build a Speech Rater with AI?

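# Create the repo and name the default branch
git init
git branch -m main

# Check the current identity, set it globally, then confirm it
git config user.name
git config --global user.name "JAlcocerT"
git config --global user.name

# First commit
git add .
git commit -m "Initial commit: Python Speech Rater project with OpenAI TTS/S2T"

# Push to a new private GitHub repo with the GitHub CLI
#sudo apt install gh
gh auth login
gh repo create --private --source=. --remote=origin --push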