“Make amateur radio cool again”, said Mr Artificial Intelligence.
A project on building a speech recognition system for amateur radio communication.
I speak into a Baofeng handheld radio transceiver, a DIY antenna picks up the radio wave, an SDR dongle demodulates the radio signal into a standard audio signal, the Google Speech-to-Text API performs the speech recognition, the Smith-Waterman algorithm performs sequence alignment to find the most probable call sign in a database, and AJAX is used with a local HTTP server to display the text. The system diagram is shown below.
This is the demo. It successfully finds the most probable call sign from a database and captures the message “monitoring”.
Hardware (antenna, SDR dongle, FM radio transceiver):
I attach a 0.5-meter (~wavelength/4 for the 2 m band) copper wire to connectors to form my antenna. I buy all my connectors from Amazon and the copper wire from a local hardware store.
The parts for the antenna are
- 0.5 meter copper wire with a diameter of 0.15 mm
- UHF Female Jack Solder SO-239
- RF coaxial coax adapter F female to UHF male PL-259 connector
- N Female to F Male Connector RF Coax Coaxial Adapter
- DHT Electronics RF coaxial coax cable assembly N male to MCX male right angle 6"
I use a handheld FM radio transceiver (Baofeng UV-5R v2+) for the ham radio transmission. A NooElec SDR dongle receives the signal from the antenna and sends it to my laptop. Both are available on Amazon.
The complete hardware setup is shown below.
Software-defined radio/digital signal processing:
According to Wikipedia,
> Software-defined radio (SDR) is a radio communication system where components that have been traditionally implemented in hardware (e.g. mixers, filters, amplifiers, modulators/demodulators, detectors, etc.) are instead implemented by means of software on a personal computer or embedded system.
I use SDRSharp to do all the signal processing and conversion. A screenshot of SDRSharp in operation is shown below.
I use Google Speech-to-Text API for the speech recognition.
Here is the snippet.
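A minimal sketch of this step, using the third-party SpeechRecognition package (my assumption; the original code may differ). The `transcribe_wav` helper and the `normalize` step are illustrative:

```python
def transcribe_wav(path):
    """Transcribe a WAV file captured from the SDR audio output."""
    # Third-party dependency (assumed): pip install SpeechRecognition
    import speech_recognition as sr
    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        audio = recognizer.record(source)
    try:
        # Sends the audio to the free Google Web Speech endpoint.
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        return ""  # speech was unintelligible

def normalize(text):
    """Lowercase and collapse whitespace so downstream matching is consistent."""
    return " ".join(text.lower().split())
```

The `normalize` helper is useful because the recognizer's casing and spacing vary between runs, which would otherwise hurt the call-sign matching in the next step.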
But as shown in the demo, Google Speech-to-Text struggles a bit and produces some errors, probably due to the background noise. I think this could be improved by training a deep net on a dataset taken from real ham radio conversations.
Very likely, the speech recognition system will make some errors. I alleviate that problem by keeping a database of call signs. The database can be constructed with data from a variety of sources. For example, for aircraft radio communication, dump1090 is a nice program that can capture the information of aircraft by decoding messages sent on 1090MHz. Alternatively, one may simply use a logbook that stores the call signs of people who frequently use local repeaters.
Once we have a database of call signs, we can use the Smith-Waterman alignment algorithm to find the best match.
Here is the snippet.
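A minimal Python sketch of this matching step (the match/mismatch/gap scores of +2/−1/−1 are my assumption; the original snippet may use different parameters):

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local-alignment score between strings a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best local alignment score ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: scores are floored at 0 so a bad prefix is dropped.
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best

def best_call_sign(transcript, call_signs):
    """Pick the database call sign with the highest alignment score."""
    return max(call_signs, key=lambda cs: smith_waterman(transcript.upper(), cs.upper()))
```

Because Smith-Waterman scores local alignments, a call sign spelled out with gaps or surrounding chatter ("w 1 a w monitoring") still matches its database entry well.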
For real-time updates, I use AJAX. AJAX enables the webpage to automatically fetch information from a file containing the call signs, transcribed speech, and timestamps.
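On the recognizer side, each result can be appended to a plain-text file served by the local HTTP server, which the webpage then polls. A sketch of that step (the `output.txt` name and JSON-lines format are my assumptions):

```python
import json
import time

def append_result(call_sign, speech, path="output.txt"):
    """Append one timestamped recognition result for the webpage to fetch."""
    entry = {
        "timestamp": time.strftime("%Y-%m-%d %H:%M:%S"),
        "call_sign": call_sign,
        "speech": speech,
    }
    # One JSON object per line keeps the file easy to parse incrementally.
    with open(path, "a") as f:
        f.write(json.dumps(entry) + "\n")
```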
I run a simple HTTP server locally by typing the following command in the terminal (on Python 3, the equivalent is `python -m http.server`).
python -m SimpleHTTPServer
The snippet for the webpage is shown below.
A screenshot of the webpage is shown below.
Many people say that amateur radio is a dying hobby. I find that sad, because there are many interesting things one can do with it, especially when it is combined with more recent technology (like AI). So I decided to work on this project and share it with you all. I hope you find it fun.