[Recap] on Audio. Creating Speech rAIter micro SaaS
TL;DR
Doing TTS/S2T with Streamlit and the st speechraiter was fun.
Now it's time to build something similar with Flask.
You can get started with this kind of project like so.
Intro
Recently I heard about myminutes.ai to summarize meetings.
What can we build around audio/speech?
Can you imagine practicing job interviews in front of an AI?
Some kind of way to LandThatJob
Or practicing that very important presentation with a SpeechPractice service.
The Speech Rater Stack
Previously I made a PoC Streamlit version:

Choosing from Streamlit to Flask vs FastAPI 📌
I just went forward with Cursor and FastAPI,
and quickly made that simple UI, connected to OpenAI TTS/S2T capabilities.
With this architecture, see the Mermaid diagram.
As you can see in the video, where I show how it works while building in public (BiP), this already helps with my quick YouTube video creation.
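The TTS/S2T round trip behind that UI can be sketched roughly like this. It is a minimal sketch, not the code from the repo: it assumes the official `openai` Python SDK, and the model names `whisper-1` and `tts-1` plus the file names are my assumptions (only the Onyx voice is mentioned in this post). The client is passed in as a parameter so the two helpers can be tried with a stub before spending API credits.

```python
from pathlib import Path


def transcribe(client, audio_path, model="whisper-1"):
    """Speech-to-text: send an audio file and return the transcript text."""
    with open(audio_path, "rb") as audio_file:
        result = client.audio.transcriptions.create(model=model, file=audio_file)
    return result.text


def synthesize(client, text, out_path, voice="onyx", model="tts-1"):
    """Text-to-speech: render `text` with the given voice and save the audio bytes."""
    response = client.audio.speech.create(model=model, voice=voice, input=text)
    out_file = Path(out_path)
    out_file.write_bytes(response.read())
    return out_file


# Usage with the real SDK (needs `pip install openai` and OPENAI_API_KEY set):
#   from openai import OpenAI
#   client = OpenAI()
#   transcript = transcribe(client, "recording.wav")   # hypothetical input file
#   synthesize(client, transcript, "voiceover.mp3")    # hypothetical output file
```

In a FastAPI app these two helpers would typically sit behind an upload endpoint, with the edited transcript going back through `synthesize` once it looks right.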
Hey, what about the speech rater stuff?
Simple md Editor
Later on, I added simple markdown editing capabilities (there were a few candidates):
- Monaco Editor (VS Code Editor) - RECOMMENDED ⭐ (this is the one Cursor went for, enough for a quick edit)
  Pros: Full VS Code experience, syntax highlighting, IntelliSense, built-in markdown preview
  Cons: Larger bundle size (~2MB)
  Best for: Professional editing experience
- CodeMirror 6 - LIGHTWEIGHT ⭐
  Pros: Lightweight, fast, good markdown support, customizable
  Cons: Fewer features than Monaco
  Best for: Balanced performance and features
- SimpleMDE (Markdown Editor) - SIMPLE
  Pros: Very lightweight, live preview, easy to use
  Cons: Fewer advanced features
  Best for: Simple editing needs
- Toast UI Editor - MODERN
  Pros: WYSIWYG + markdown, good mobile support
  Cons: Medium bundle size
  Best for: User-friendly editing
Thanks to the Monaco editor integration, we can quickly tweak the content of the transcript before saving the .md
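The "save the .md" step can be as small as the helper below. This is only a sketch under my own assumptions: the `transcripts` folder, the `slugify` and `save_transcript` names, and the little header with the save time are all hypothetical, not what the app actually does.

```python
import re
from datetime import datetime, timezone
from pathlib import Path


def slugify(title):
    """Turn a free-form title into a safe file name stem."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "transcript"


def save_transcript(text, title, out_dir="transcripts"):
    """Write the (possibly edited) transcript to <out_dir>/<slug>.md and return the path."""
    folder = Path(out_dir)
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{slugify(title)}.md"
    saved_at = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    path.write_text(f"# {title}\n\n_Saved: {saved_at}_\n\n{text.strip()}\n", encoding="utf-8")
    return path
```

The backend would call something like `save_transcript(edited_text, "My first video")` with whatever the editor sends back.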
The FastAPI Speech Rater
I finally wanted to combine FastAPI (BE) x SQLite for simple user management x a cool Astro SSG theme
because…
How could I not try an Astro theme…
MIT | Idol is an elegant landing page template for micro SaaS products built with AstroJS & Skeleton CSS
git clone https://github.com/LaB-CH3/Astro-idol
#npm run dev -- --host 0.0.0.0 --port 4321 #http://192.168.1.11:4321/
After asking Cursor to connect the Astro theme with FastAPI and make login possible via SQLite…
This happened:
# Start both servers
make dev-full
#make docker-dev-build
make docker-dev-up # Start both servers in containers FAST and ASTRO working together!!!
#cd /home/jalcocert/Desktop/py-speech-rater/fastapi-speech-rater && sqlite3 ./users.db ".schema users"
# Check all users
sqlite3 ./users.db "SELECT id, email, first_name, last_name, created_at FROM users;"
# Check specific user by email
sqlite3 ./users.db "SELECT * FROM users WHERE email = 'test@example.com';"
# Count total users
sqlite3 ./users.db "SELECT COUNT(*) FROM users;"
#sqlite3 ./users.db
#.tables
#SELECT id, email, first_name, last_name, created_at FROM users;
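To make the SQLite side less of a black box, here is a minimal sketch of a `users` table and a register/login pair using only the Python standard library. The `id`, `email`, `first_name`, `last_name` and `created_at` columns come from the queries above; the `password_hash` column, the PBKDF2 settings and all function names are my assumptions, and the real app may well use a dedicated hashing library instead.

```python
import hashlib
import hmac
import os
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS users (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    email TEXT NOT NULL UNIQUE,
    first_name TEXT,
    last_name TEXT,
    password_hash TEXT NOT NULL,  -- assumed column, not shown in the queries above
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
"""


def open_db(path="users.db"):
    """Open (or create) the database and make sure the users table exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def hash_password(password, salt=None, iterations=200_000):
    """Return 'salt_hex$digest_hex' using PBKDF2-HMAC-SHA256."""
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return f"{salt.hex()}${digest.hex()}"


def verify_password(password, stored):
    """Re-hash with the stored salt and compare in constant time."""
    salt_hex, _ = stored.split("$")
    candidate = hash_password(password, bytes.fromhex(salt_hex))
    return hmac.compare_digest(candidate, stored)


def register(conn, email, password, first_name="", last_name=""):
    """Insert a new user; raises sqlite3.IntegrityError if the email already exists."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO users (email, first_name, last_name, password_hash) VALUES (?, ?, ?, ?)",
            (email, first_name, last_name, hash_password(password)),
        )
    return cur.lastrowid


def login(conn, email, password):
    """Return True only if the email exists and the password matches."""
    row = conn.execute("SELECT password_hash FROM users WHERE email = ?", (email,)).fetchone()
    return row is not None and verify_password(password, row[0])
```

A FastAPI login route would call `login(...)` and, on success, hand the Astro frontend a session cookie or token.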
The setup even works in containers thanks to this compose file
Conclusions
This simple FastAPI recorder and transcript web app already helps me.
Now I can try to do those YouTube tech videos I wanted to do this year.
Just recording with OBS, cutting quickly with Kdenlive and recording my audio with Audacity.
Then it gets uploaded into this new py-speech-rater
and we get the voice via Onyx thanks to OpenAI S2T & TTS :)
What can be next from here?
Considering that FastAPI and Astro can talk to each other perfectly…
Making admin panels / dashboards / data apps (displaying via Chart.js) with this stack does not seem that complicated anymore…
See the ./fastapi-speech-rater folder that contains those, and the related tech doc with the system’s architecture.
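For the dashboard idea, the backend mostly has to shape query results into the config object Chart.js takes (`type` plus `data.labels` and `data.datasets`). Below is a rough sketch; the "signups per day" metric and both function names are made up for illustration, and it assumes a `users` table with a `created_at` timestamp column.

```python
import sqlite3


def signups_per_day(conn):
    """Count users per calendar day, assuming a `users.created_at` timestamp column."""
    return conn.execute(
        "SELECT date(created_at) AS day, COUNT(*) FROM users GROUP BY day ORDER BY day"
    ).fetchall()


def to_chartjs_bar(rows, label="Signups per day"):
    """Shape (label, value) rows into a Chart.js bar chart config."""
    return {
        "type": "bar",
        "data": {
            "labels": [day for day, _ in rows],
            "datasets": [{"label": label, "data": [count for _, count in rows]}],
        },
    }
```

A FastAPI route can return that dict as JSON, and the Astro page can fetch it and pass it to `new Chart(ctx, config)`.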
I don't see any reason not to ship micro SaaS faster, like:
- Preparing Interviews with AI
I saw something interesting at interviewsby.ai, where you upload your resume for feedback
- Preparing/Rating Speech with AI…
All these would need is one of those MIT Astro micro SaaS themes + proper email validation (Logto JS + CSR?) + whatever backend logic via FastAPI/pb/any other
FAQ
How to get started and build a Speech Rater with AI?
git init
git branch -m main
git config user.name
git config --global user.name "JAlcocerT"
git config --global user.name
git add .
git commit -m "Initial commit: Python Speech Rater project with OpenAI TTS/S2T"
#sudo apt install gh
gh auth login
gh repo create --private --source=. --remote=origin --push