
The O2Jam OJM format is a container file holding all the key sound effects and background sounds for a song.

OJM files can actually use different internal formats; so far we know of 3 types:

M30: this format contains sounds in OGG format with a (stupid-)simple encoding.

OMC and OJM: these can be parsed by the same scheme, so I will treat them together.
They can contain PCM (raw WAV) data or OGG, in a rather evil (still stupid) encoding.

To make things easier for everyone, and to demonstrate the algorithm, I made a simple program that takes an OJM file and extracts all of its sounds. You can download it here; the source code is here.

The M30:

M30 files start with this header (28 bytes in total):

struct M30_header {
    char signature[4]; // "M30"
    int file_format_version;
    int encryption_flag;
    int sample_count;
    int samples_offset;
    int payload_size;
    int padding;
};

file_format_version: internal version of the OJM format.

The encryption_flag says how the data was encrypted:
1 – scramble1
2 – scramble2
4 – decode
8 – decrypt
16 – nami

sample_count holds the number of sound files inside the OJM,
but we don’t rely on it much because we have found files where this number differs from the actual number of sounds.

samples_offset says where the data actually starts, usually right after the header (offset 28).

payload_size is the size of the rest of the file in bytes, usually total_file_size - 28 (the header).
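As a rough illustration of the header layout above, here is a minimal sketch that reads the 28 bytes from a memory buffer. It assumes the integers are stored little-endian (the post does not say so explicitly), and the names m30_header, read_le32 and parse_m30_header are made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Mirrors the M30_header struct described above (28 bytes on disk). */
typedef struct {
    char    signature[4];
    int32_t file_format_version;
    int32_t encryption_flag;
    int32_t sample_count;
    int32_t samples_offset;
    int32_t payload_size;
    int32_t padding;
} m30_header;

/* Read a 32-bit integer byte by byte; little-endian order is an assumption. */
static int32_t read_le32(const unsigned char *p) {
    return (int32_t)((uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                     ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24));
}

/* Returns 0 on success, -1 if the buffer is too short or does not start with "M30". */
static int parse_m30_header(const unsigned char *buf, size_t len, m30_header *out) {
    if (len < 28 || memcmp(buf, "M30", 3) != 0)
        return -1;
    memcpy(out->signature, buf, 4);
    out->file_format_version = read_le32(buf + 4);
    out->encryption_flag     = read_le32(buf + 8);
    out->sample_count        = read_le32(buf + 12);
    out->samples_offset      = read_le32(buf + 16);
    out->payload_size        = read_le32(buf + 20);
    out->padding             = read_le32(buf + 24);
    return 0;
}
```

Reading field by field instead of casting the buffer to a struct avoids depending on the compiler's padding and on the host byte order.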

The rest of the file is the sound data, or samples, as I call them.

Each sample has a header too (52 bytes):

struct M30_sample_header {
    char sample_name[32];
    int sample_size;
    short codec_code;
    short unk_fixed;
    int unk_music_flag;
    short ref;
    short unk_zero;
    int pcm_samples;
};

sample_name: a string with a name for the sample.

sample_size: the size, in bytes, of the actual sample data.

codec_code: there are 2 types (so far) in M30: 0 means background sound (note type 4, M###), 5 means normal sound (W###).

ref: this corresponds to the note ref, meaning that sample ref X will be played when note ref X is tapped.

pcm_samples: the total number of PCM samples in the OGG.
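The field offsets implied by the struct above can be sketched like this. Again little-endian storage is assumed, the unknown fields are skipped, and m30_sample_info, parse_m30_sample_header and m30_sample_kind are hypothetical names used only here.

```c
#include <stdint.h>
#include <string.h>

/* The fields of the 52-byte sample header that the text explains. */
typedef struct {
    char    sample_name[33];  /* 32 bytes from the file plus a terminating NUL */
    int32_t sample_size;
    int16_t codec_code;
    int16_t ref;
    int32_t pcm_samples;
} m30_sample_info;

/* Little-endian readers; the byte order is an assumption. */
static int32_t read_le32(const unsigned char *p) {
    return (int32_t)((uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                     ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24));
}

static int16_t read_le16(const unsigned char *p) {
    return (int16_t)(uint16_t)(p[0] | (p[1] << 8));
}

/* Returns 0 on success, -1 if fewer than 52 bytes are available. */
static int parse_m30_sample_header(const unsigned char *buf, size_t len,
                                   m30_sample_info *out) {
    if (len < 52)
        return -1;
    memcpy(out->sample_name, buf, 32);
    out->sample_name[32] = '\0';
    out->sample_size = read_le32(buf + 32);
    out->codec_code  = read_le16(buf + 36);  /* unk_fixed follows at 38 */
    out->ref         = read_le16(buf + 44);  /* unk_music_flag sits at 40 */
    out->pcm_samples = read_le32(buf + 48);  /* unk_zero sits at 46 */
    return 0;
}

/* Map codec_code to the note prefix from the text: 0 is M###, 5 is W###. */
static char m30_sample_kind(int16_t codec_code) {
    switch (codec_code) {
    case 0:  return 'M';
    case 5:  return 'W';
    default: return '?';  /* no other values are documented in the post */
    }
}
```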

Now, the next sample_size bytes are the sound data.
The encryption_flag determines what we do to extract the data. So far I have only come across 16 (nami); files with flags greater than that contained plain OGG.

In the nami case, we need to XOR all the data with ‘nami’, as this code illustrates:

for (int i = 0; i + 3 < data.length; i += 4) {
    data[i+0] ^= 'n';
    data[i+1] ^= 'a';
    data[i+2] ^= 'm';
    data[i+3] ^= 'i';
}

* It is important to note that if the data length isn’t a multiple of 4, the remaining bytes are NOT XOR’ed.

After this, the data contains the sound in OGG. Following it is another sample header, and so we repeat until the whole file has been read.
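The nami step described above could look like this as a standalone C function. This is only a sketch; nami_xor is a name chosen for this example.

```c
#include <stddef.h>

/* XOR the buffer in place with the repeating key "nami", four bytes at a time.
   Trailing bytes that do not fill a whole 4-byte group are left untouched,
   as noted in the text. */
static void nami_xor(unsigned char *data, size_t len) {
    static const unsigned char key[4] = { 'n', 'a', 'm', 'i' };
    for (size_t i = 0; i + 3 < len; i += 4) {
        for (size_t k = 0; k < 4; k++)
            data[i + k] ^= key[k];
    }
}
```

Since XOR is its own inverse, the same function both encodes and decodes. A quick sanity check after decoding is that the data starts with the four bytes "OggS", the capture pattern that begins every Ogg page.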



The OMC/OJM:

The header is 20 bytes:

struct OMC_header {
    char signature[4]; // "OMC" or "OJM"
    short wav_count;
    short ogg_count;
    int wav_start;
    int ogg_start;
    int filesize;
};

wav_start is 20 usually, as the wav data starts after the header.

ogg_start is the offset of the OGG data, usually just after the wav data.

filesize is the file size, really.

wav_count and ogg_count are supposed to be the number of samples in each section.
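A minimal sketch of reading this 20-byte header from a buffer follows. It assumes little-endian integers, and omc_header and parse_omc_header are hypothetical names for this example.

```c
#include <stdint.h>
#include <string.h>

/* Mirrors the OMC_header struct described above (20 bytes on disk). */
typedef struct {
    char    signature[4];
    int16_t wav_count;
    int16_t ogg_count;
    int32_t wav_start;
    int32_t ogg_start;
    int32_t filesize;
} omc_header;

/* Little-endian readers; the byte order is an assumption. */
static int32_t read_le32(const unsigned char *p) {
    return (int32_t)((uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                     ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24));
}

static int16_t read_le16(const unsigned char *p) {
    return (int16_t)(uint16_t)(p[0] | (p[1] << 8));
}

/* Returns 0 on success, -1 if the buffer is too short or the signature
   is neither "OMC" nor "OJM". */
static int parse_omc_header(const unsigned char *buf, size_t len, omc_header *out) {
    if (len < 20)
        return -1;
    if (memcmp(buf, "OMC", 3) != 0 && memcmp(buf, "OJM", 3) != 0)
        return -1;
    memcpy(out->signature, buf, 4);
    out->wav_count = read_le16(buf + 4);
    out->ogg_count = read_le16(buf + 6);
    out->wav_start = read_le32(buf + 8);
    out->ogg_start = read_le32(buf + 12);
    out->filesize  = read_le32(buf + 16);
    return 0;
}
```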

So, after this header we have the WAV data section. From here up to ogg_start we do this:

Read a sample header (56 bytes):

struct OMC_WAV_header {
    char sample_name[32];
    short audio_format;
    short num_channels;
    int sample_rate;
    int bit_rate;
    short block_align;
    short bits_per_sample;
    int unk_data;
    int chunk_size;
};

sample_name: is a string with the name of the sample.

audio_format, num_channels, sample_rate, bit_rate, block_align and bits_per_sample are needed to make the WAV header, otherwise we can’t play the sound.

chunk_size: the size of this sample data.

Now, unless chunk_size is zero (yes, it happens more often than you think), we read the chunk data, and:

Remember the evil-stupid encoding I mentioned? Basically they shuffle the sound data according to some logic. Why? Go figure. If you want to take a look at it anyway, see this file and look for the rearrange function.

After un-shuffling the data, there’s still another dose of WTF, with what they call “ACC_XOR” (which isn’t really a XOR); it’s also in the file I mentioned.
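Going by MrKishi's description in the comments below (initial key 0xFF, initial bit 0x80, the key replaced by the still-encoded byte once all 8 bits have been used), the ACC_XOR step might be sketched as follows. This is an interpretation of that description, not a copy of the program's source; acc_xor_state, acc_xor_reset and acc_xor_decode are names invented for this example.

```c
#include <stddef.h>

/* Decoder state carried between calls, so that a stream can be processed
   in several pieces without losing the key or the bit position. */
typedef struct {
    unsigned char key;  /* current key byte */
    unsigned char bit;  /* mask of the key bit tested for the next data byte */
} acc_xor_state;

/* Initial values as given in the comments: key 0xFF, bit 0x80. */
static void acc_xor_reset(acc_xor_state *st) {
    st->key = 0xFF;
    st->bit = 0x80;
}

/* For each byte: complement it if the current key bit is set, then move to
   the next bit. After 8 bytes the key becomes the original (undecoded)
   value of the last byte processed. */
static void acc_xor_decode(acc_xor_state *st, unsigned char *data, size_t len) {
    for (size_t i = 0; i < len; i++) {
        unsigned char original = data[i];
        if (st->key & st->bit)
            data[i] = (unsigned char)~data[i];
        st->bit >>= 1;
        if (st->bit == 0) {
            st->bit = 0x80;
            st->key = original;
        }
    }
}
```

Keeping the state in a struct instead of in static variables means the caller decides when to reset it, for example once per file or once per sample.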

Finally, after all that, we have raw WAV data, but don’t forget the header variables; the data is useless without them.
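To make the decoded data playable, the header variables can be written into a standard 44-byte RIFF/WAVE header placed in front of it. The sketch below assumes that the bit_rate field goes into the slot the WAV format calls the byte rate (it sits in the same position of the format block), and build_wav_header is a hypothetical helper name.

```c
#include <stdint.h>
#include <string.h>

/* Little-endian writers, as required by the RIFF/WAVE format. */
static void put_le16(unsigned char *p, uint16_t v) {
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)(v >> 8);
}

static void put_le32(unsigned char *p, uint32_t v) {
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)((v >> 8) & 0xFF);
    p[2] = (unsigned char)((v >> 16) & 0xFF);
    p[3] = (unsigned char)(v >> 24);
}

/* Fill a canonical 44-byte WAV header from the OMC_WAV_header fields.
   chunk_size is the number of bytes of sample data that will follow. */
static void build_wav_header(unsigned char out[44],
                             uint16_t audio_format, uint16_t num_channels,
                             uint32_t sample_rate, uint32_t bit_rate,
                             uint16_t block_align, uint16_t bits_per_sample,
                             uint32_t chunk_size) {
    memcpy(out, "RIFF", 4);
    put_le32(out + 4, 36 + chunk_size);   /* size of everything after this field */
    memcpy(out + 8, "WAVE", 4);
    memcpy(out + 12, "fmt ", 4);
    put_le32(out + 16, 16);               /* size of the format block */
    put_le16(out + 20, audio_format);
    put_le16(out + 22, num_channels);
    put_le32(out + 24, sample_rate);
    put_le32(out + 28, bit_rate);         /* assumption: used as the byte rate */
    put_le16(out + 32, block_align);
    put_le16(out + 34, bits_per_sample);
    memcpy(out + 36, "data", 4);
    put_le32(out + 40, chunk_size);
}
```

Writing these 44 bytes followed by the decoded chunk to a file should give a .wav that ordinary players accept, provided the header values in the OJM are sane.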

We repeat all this until we get to ogg_start, then we switch to this:

A header of 36 bytes:

struct OMC_OGG_header {
    char sample_name[32];
    int sample_size;
};

Yep, that’s it, sample_size is the data size, and sample_name is the name.

Funny enough, we have pure OGG data here, no tricks.
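Walking the OGG section then comes down to reading a 36-byte header, taking the next sample_size bytes, and repeating until the data runs out. The sketch below assumes little-endian sizes and works on a file already loaded in memory; walk_omc_ogg_section and the callback type are hypothetical names for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Called once per sample; name points at the 32-byte name field, which is
   not guaranteed to be NUL-terminated. */
typedef void (*ogg_sample_cb)(const char *name, const unsigned char *data,
                              uint32_t size, void *user);

/* Little-endian reader; the byte order is an assumption. */
static uint32_t read_le32u(const unsigned char *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Walk the OGG section starting at ogg_start and return the number of
   complete samples found. Stops at the first header whose sample_size
   runs past the end of the buffer. cb may be NULL to just count. */
static size_t walk_omc_ogg_section(const unsigned char *buf, size_t len,
                                   size_t ogg_start, ogg_sample_cb cb, void *user) {
    size_t pos = ogg_start;
    size_t count = 0;
    if (ogg_start > len)
        return 0;
    while (len - pos >= 36) {
        const unsigned char *header = buf + pos;
        uint32_t size = read_le32u(header + 32);
        pos += 36;
        if (size > len - pos)
            break;  /* truncated or corrupt sample */
        if (cb)
            cb((const char *)header, buf + pos, size, user);
        pos += size;
        count++;
    }
    return count;
}
```

Counting by walking the data rather than trusting ogg_count follows the same caution the post applies to sample_count in M30 files.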

All the samples read in the WAV block are key sounds (W###), and all those read in the OGG block are background sounds (M###).

And that’s all, for now.

Also, as far as I know, the only OJM files this doesn’t read are the new o2china songs (which apparently use blowfish encryption?). If you find another OJM that this program doesn’t read, let me know.


  1. hi! your program works like a charm! 🙂
    hhm, any idea as to how we could mix all of these ogg files?
    i tried it on o2ma197.ojm and it gave me 363 ogg files.

    • mix you mean join them together ? I don’t know, some audio editor program maybe ?

  2. Hey. It’s me again, haha.

    A few pointers:

    struct OMC_header {
    char signature[4];
    short wav_count;
    short ogg_count;
    int wav_start;
    int ogg_start;
    int filesize;
    };

    struct M30_header {
    char signature[4];
    int file_format_version;
    int encryption_flag; //1-scramble1, 2-scramble2, 4-decode, 8-decrypt, 16-nami.. vittee figured this one via disassembly
    int samples_count;
    int samples_offset;
    int file_size;
    int padding;
    };

    Now, I don’t remember this one that well, and I wasn’t even that sure back then… But it goes something like..

    struct M30_sample_header {
    char sample_name[32];
    int sample_size;
    int codec_code; //again, vittee got this. didn’t know much about it though.
    int music_flag; //something like that.
    short ref;
    short unk_zero;
    int pcm_samples; //i’m sure about this one, although I’m not sure why they needed this. it has to be the total pcm samples on the ogg.
    };

    • Hello again,

      wow, thanks a lot for this, again we are so close to completing this. Just the encryption_flag made me re-think, and I’m probably going to search for OJMs with those other codes now. Also, I’m not sure codec_code is an int; so far the last 2 bytes are fixed. But anyway, this was of great help, thanks!

        • MrKishi
        • Posted January 28, 2011 at 5:52 am

        acc_xor is not symmetric, by the way.
        The algorithm on github is actually for the encoding/encryption.

        And also treat the counter at the end of the for loop, or save “temp” along with “acc_keybyte” and “acc_counter”, or else you may get incorrect bytes when the for-loop ends right after a “counter++ == 8” (temp will be 0 on the next function call).

        • DarkFox
        • Posted January 31, 2011 at 10:00 pm

        hey again,

        I tried to make some tests but, honestly, it got worse, that or I didn’t quite get what you mean.. the counter treatment I agree, but how is this the encoding method ?

        • MrKishi
        • Posted January 31, 2011 at 11:07 pm

        Woops. I actually misnamed my methods while reverse-engineering. That _is_ the decryption, you don’t need to inverse it.
        The ‘temp’ point is still valid, though. I’d actually encourage you to shift your logic, moving the conditionals after the complement so you don’t have to keep ‘temp’ between calls. Sample c code:

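        char *flag4_bit_decrypt(char *sample_data, int sample_size, char reset_key) {
            static char key = INITIAL_KEY;
            static unsigned char bit = INITIAL_BIT;

            char *current;
            char *end = sample_data + sample_size;

            if (reset_key) {
                key = INITIAL_KEY;
                bit = INITIAL_BIT;
            }

            // NOTE: the loop header and the complement step were garbled in this
            // comment; they are reconstructed here from the description above.
            for (current = sample_data; current < end; current++) {
                char original_current = *current;
                if (key & bit)
                    *current = ~*current;
                bit >>= 1;
                if (!bit) {
                    bit = INITIAL_BIT;
                    key = original_current;
                }
            }

            return sample_data;
        }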

        reset_key is used on certain types of files (if I recall correctly, M30 with encryption flag 4 active) for the encryption to work per-sample instead of per-file.

        • MrKishi
        • Posted January 31, 2011 at 11:09 pm

        Oh, and of course:
        const char INITIAL_KEY = 0xFF;
        const unsigned char INITIAL_BIT = 0x80;
