Wednesday, September 20, 2023

Encoding MP3 for CW Practice

Pangram Audio Files


In my CW-Ops Academy class, we are learning to copy simple phrases and words.

Among those phrases are lines of text called pangrams.

If the word pangram is unfamiliar, I'm sure this is not:

The quick brown fox jumps over the lazy dog.

That's a pangram.

In the class we were given a list of them and asked to practice sending them.  I wanted to practice copying them as well, so I used some existing software, and wrote a bit more around it, to encode the pangrams as audio files.



So that you can do this yourself, here are the steps.

Just a note: it takes a bit of technical know-how to get set up.  I don't have it all packaged up so nicely that it is a simple "application" to download.  You'll have to build the environment yourself.

I did so on a Linux system.  I'm sure it's possible to do this on Cygwin or even in Windows, but the steps for that are more complicated.  The key is that these tools are scriptable.  They are not GUI applications, and a GUI application is a poor fit for batch processing strings of text into MP3s.

The first step is to get the main software, which encodes text into WAV files.  WAV files are not the final product we want, but we need them first.

(If you don't have git, then I'd reconsider this process.  It needs some skill in getting the software installed correctly.  If you've never used git, then you're likely not to have used apt, which means unless you want to do some additional homework, the process may frustrate you.  But you are encouraged to carry on if you wish.)

The tool we need is a Python script that someone wrote to convert text into WAV files.  Many tools can probably do this, but this one is very simple and command-line oriented.  We do not want a GUI application.

That tool is located here: https://github.com/cduck/morse


So, in Linux  (or any shell where you have git), clone the repo.

$ git clone https://github.com/cduck/morse.git

Don't use the software there yet.  We need another tool first: ffmpeg.

$ sudo apt install ffmpeg

ffmpeg is a package that you can install on your system; the command above installs it on distributions that use apt.

Once those tools are installed, here is how you can test them.

$ cd morse

That takes you into the directory of the repository you cloned above.

Then type this:

$ echo "hello world" | python3 play.py -f 650 --fs 10 --wpm 20 -o input.wav

What that does is pipe the string "hello world" into the Python script play.py, which is run with these options:

-f 650
--fs 10
--wpm 20
-o input.wav

Here is what they mean:

-f N   is to set the tone frequency (Hz) of the CW.   Choose an N that is correct for your use case.

--fs N  is to set the Farnsworth speed of the words.  The lower the value N, the more space there is between the letters.

--wpm N is to set the words per minute speed (this is effectively the character speed).  Each character of each word is timed to this speed.  If you are used to setting your "speed" dial on your radio for CW, this is that setting.

-o FILE  is to specify which FILE to use when writing out the resulting WAV file.
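
If you would rather drive play.py from a script than type the command each time, here is a minimal sketch in Python.  The function names morse_wav_command and text_to_wav are my own inventions for illustration, not part of the morse repository, and the sketch assumes it is run from inside the cloned morse directory so that play.py can be found.

```python
import subprocess

def morse_wav_command(out_path, tone_hz=650, wpm=20, farnsworth=10):
    """Build the argument list for play.py using the four options above."""
    return [
        "python3", "play.py",
        "-f", str(tone_hz),        # tone frequency in Hz
        "--fs", str(farnsworth),   # Farnsworth speed
        "--wpm", str(wpm),         # character speed
        "-o", out_path,            # WAV file to write
    ]

def text_to_wav(text, out_path, **options):
    """Feed text to play.py on standard input, like `echo text | python3 play.py ...`."""
    subprocess.run(
        morse_wav_command(out_path, **options),
        input=text + "\n",  # echo adds a trailing newline, so do the same
        text=True,
        check=True,         # raise an error if play.py fails
    )
```

Calling text_to_wav("hello world", "input.wav") should then be equivalent to the echo command shown earlier.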

Usually the "wpm" speed should be greater than or equal to the Farnsworth speed.  A Farnsworth speed greater than the WPM speed doesn't make any sense.

If you want the audio to sound like 25 wpm but the space between the letters and words is evident, then set a Farnsworth speed to about half of the wpm speed.  Make a file from that and then adjust the parameters to suit your need.
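
To get a feel for how much extra space a given pair of speeds produces, here is a small sketch of the arithmetic.  It uses the ARRL Farnsworth timing formula; whether play.py computes its gaps in exactly this way is an assumption on my part, so treat the numbers as a guide only.  The function name farnsworth_timing is made up for this example.

```python
def farnsworth_timing(char_wpm, overall_wpm):
    """Return (dit, letter_gap, word_gap) in seconds.

    Assumes the ARRL Farnsworth formula; play.py may differ in detail.
    """
    if overall_wpm > char_wpm:
        raise ValueError("Farnsworth speed should not exceed the wpm speed")
    dit = 1.2 / char_wpm  # length of one dit at the character speed
    # Total time available for spacing in one standard 50-unit word.
    total_gap = (60 * char_wpm - 37.2 * overall_wpm) / (char_wpm * overall_wpm)
    letter_gap = 3 * total_gap / 19  # space between letters
    word_gap = 7 * total_gap / 19    # space between words
    return dit, letter_gap, word_gap

dit, letter_gap, word_gap = farnsworth_timing(20, 10)
print(f"dit {dit:.3f}s, letter gap {letter_gap:.3f}s, word gap {word_gap:.3f}s")
```

When both speeds are equal, the formula gives the normal spacing of 3 dits between letters and 7 dits between words.  Lowering the Farnsworth speed stretches only the gaps; the dits and dahs stay the same length.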

But you need that first file generated, so let's continue.

Some systems may let you actually play WAV files as-is, and if that is the case, you can try to do that.
But this process has a second step which is to convert the WAV files into MP3.

We need the ffmpeg application to do that:

Here's the command:

$ ffmpeg -i input.wav -vn -ar 44100 -ac 2 -b:a 192k sample.mp3

The ffmpeg application has many options; we're only using a fraction of them.

-i FILE  tells ffmpeg which file to read as the source audio.
-vn  is a flag that keeps any video stream out of the result.
-ar N  is to set the audio sampling frequency.   44100 Hz (44.1 kHz) is the sampling frequency of a standard audio CD.
-ac N  is to set the number of audio channels.  We want two audio channels, for left and right.
-b:a N  is to set the bitrate of the output MP3.  192k is plenty.  Most listeners cannot easily discern better fidelity beyond 192k.  Audiophiles may be able to, but 192k is plenty of data.
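
The same conversion can be scripted.  This is only a sketch: the function names wav_to_mp3_command and wav_to_mp3 are hypothetical, and the defaults simply mirror the values used in the command above.

```python
import subprocess

def wav_to_mp3_command(wav_path, mp3_path, sample_rate=44100, channels=2, bitrate="192k"):
    """Build the ffmpeg argument list for the WAV to MP3 conversion."""
    return [
        "ffmpeg",
        "-i", wav_path,           # source audio file
        "-vn",                    # no video in the output
        "-ar", str(sample_rate),  # sampling frequency in Hz
        "-ac", str(channels),     # number of audio channels
        "-b:a", bitrate,          # MP3 bitrate
        mp3_path,
    ]

def wav_to_mp3(wav_path, mp3_path, **options):
    """Run ffmpeg and raise an error if the conversion fails."""
    subprocess.run(wav_to_mp3_command(wav_path, mp3_path, **options), check=True)
```

Keeping the command builder separate from the function that runs it makes it easy to print the command and check it before anything is executed.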

When that finishes, there will be a sample.mp3 file.

That's the result we wanted, the MP3 of the text.

The Morse software that was cloned has knowledge of a few "pro-signs" so you can be sure that if you use prosigns the encoded audio will sound right.

For instance, for BT use the = character in your text.  You'll get what you want.  If you type the letters BT instead, you'll get the normal gap between the B and the T, and it won't sound like the BT prosign.

For AR, use the + character.  Same situation as with BT.
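
If your practice text is written with prosigns spelled out, a small helper can swap in the right characters before the text goes to play.py.  This is a sketch only: the <BT> and <AR> markers are a notation I am assuming for the example, and the table holds just the two substitutions described above.

```python
# Hypothetical markers mapped to the characters that sound as prosigns.
PROSIGN_CHARACTERS = {
    "<BT>": "=",
    "<AR>": "+",
}

def encode_prosigns(text):
    """Replace each prosign marker with its single-character equivalent."""
    for marker, character in PROSIGN_CHARACTERS.items():
        text = text.replace(marker, character)
    return text

print(encode_prosigns("TNX FER CALL <BT> UR RST 599 <AR>"))
```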

That's all there is to it.

What you can do next (and what I did) is write software around this to automate the process, so that you can generate a large number of encodings at different speeds without any manual command-line invocations.
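
Here is a rough sketch of what such automation might look like.  It is not the software I actually wrote; the function names, the file naming scheme, and the job dictionary are all made up for illustration.  The -y flag tells ffmpeg to overwrite an existing output file without asking, and the script assumes it runs from inside the cloned morse directory.

```python
import subprocess
from pathlib import Path

def plan_jobs(phrases, speeds, tone_hz=650):
    """Pair every phrase with every (wpm, Farnsworth) speed and pick file names."""
    jobs = []
    for index, phrase in enumerate(phrases, start=1):
        for wpm, farnsworth in speeds:
            stem = f"pangram{index:02d}_{wpm}wpm_fs{farnsworth}"
            jobs.append({
                "text": phrase, "tone": tone_hz, "wpm": wpm, "fs": farnsworth,
                "wav": stem + ".wav", "mp3": stem + ".mp3",
            })
    return jobs

def run_job(job):
    """Make the WAV with play.py, convert it to MP3, then delete the WAV."""
    subprocess.run(
        ["python3", "play.py", "-f", str(job["tone"]), "--fs", str(job["fs"]),
         "--wpm", str(job["wpm"]), "-o", job["wav"]],
        input=job["text"] + "\n", text=True, check=True,
    )
    subprocess.run(
        ["ffmpeg", "-y", "-i", job["wav"], "-vn", "-ar", "44100", "-ac", "2",
         "-b:a", "192k", job["mp3"]],
        check=True,
    )
    Path(job["wav"]).unlink()  # the WAV was only an intermediate file
```

A loop such as "for job in plan_jobs(phrases, [(20, 10), (25, 12)]): run_job(job)" would then produce one MP3 per phrase per speed pair.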

I'm sure if you got this far, you know exactly what you want to do next.

Good luck.

QSO Files


The other kind of encoding that you might want to do is one where two (or more) stations are involved.

The instructions above will produce a single MP3 at a given speed, and tone frequency.

What if you wanted to make it sound like a real QSO: two stations, each with a different tone frequency and perhaps even different speeds?  You'll need to make several MP3s, one for each passage (or send) of the whole QSO.

But what you'll then have is a set of MP3 files, not a single MP3 file that plays the whole QSO.

So ffmpeg can also help there.  After you've made the MP3 files of each "send" of the QSO then do this:

1. Make a file called manifest.txt

In the file manifest.txt, list ALL of the MP3 files involved in the QSO, in the order they should play.

They have to be in this format:

file 'vvv.mp3'
file 'cq.mp3'
file 'answer.mp3'
file 'report.mp3'
file 'response.mp3'
file 'tu.mp3'

In this example, the file names are arbitrary (vvv.mp3 is just a recording of the string "VVV", nothing more).  You can name the files however you want as long as each is the name you used when you generated that MP3 file.

And, most important, there is no rule or limit on how few or how many files make up the manifest.  In this example, I used six files.  You may use any number of files you want as long as the file names are listed as shown:   file 'FILE'   Keep the single quotes; they are what lets ffmpeg cope with names that contain spaces or other special characters.

You don't necessarily even need to name the manifest file manifest.txt.  You can name it whatever you want as long as you use the same filename in step 2.
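
If there are many sends, the manifest can be generated rather than typed.  A minimal sketch, with a function name (manifest_text) that is my own; it refuses names that contain a single quote rather than trying to escape them.

```python
def manifest_text(mp3_names):
    """Return the manifest contents: one file 'NAME' line per MP3, in order."""
    lines = []
    for name in mp3_names:
        if "'" in name:
            # Quotes inside a name would need escaping; keep the sketch simple.
            raise ValueError(f"file name contains a single quote: {name}")
        lines.append(f"file '{name}'")
    return "\n".join(lines) + "\n"

names = ["vvv.mp3", "cq.mp3", "answer.mp3", "report.mp3", "response.mp3", "tu.mp3"]
with open("manifest.txt", "w") as manifest:
    manifest.write(manifest_text(names))
```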

2. With the manifest made, and the constituent MP3 files already made as shown above, use ffmpeg to concatenate the audio files into one MP3 file.

$ ffmpeg -f concat -i manifest.txt sample.mp3

As you can probably guess, the tool uses the manifest.txt to iterate over each file to concatenate and then produce a final product sample.mp3 from that.

So going back to your challenge of making the QSO -- that is where the selection of the TONE frequency is important if you want to differentiate one station from another.   Even slightly different speeds can help make the QSO sound a bit more realistic.
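
One way to keep the stations straight is to write the QSO down as a list of sends and let a script attach each station's tone and speeds.  This is a sketch under my own assumptions: the station names, tones, speeds, texts, and file naming scheme are all invented for the example.

```python
def plan_qso(sends, stations):
    """Attach each station's tone and speeds to its sends, in playing order."""
    plan = []
    for number, (station, text) in enumerate(sends, start=1):
        settings = stations[station]
        plan.append({
            "mp3": f"send{number:02d}_{station}.mp3",
            "text": text,
            "tone": settings["tone"],  # value for -f
            "wpm": settings["wpm"],    # value for --wpm
            "fs": settings["fs"],      # value for --fs
        })
    return plan

# Two made-up stations with slightly different tones and speeds.
stations = {
    "caller": {"tone": 650, "wpm": 20, "fs": 15},
    "answer": {"tone": 580, "wpm": 22, "fs": 15},
}
sends = [
    ("caller", "CQ CQ CQ"),
    ("answer", "GM UR RST 599 +"),
    ("caller", "TU 73"),
]
for send in plan_qso(sends, stations):
    print(send["mp3"], send["tone"], send["wpm"])
```

Each entry has what is needed to generate one MP3 with the steps above, and the mp3 names, taken in order, are the lines of the manifest.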



