[Speex-dev] Nelly Moser Asao Codec

Nico Gulden gulden at lisog.org
Wed Oct 11 06:40:40 PDT 2006


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Hello everybody,

First of all, I apologize for cross-posting. I could not tell which
mailing list would be the right one for our request.

Struktur AG, a member of the Linux Solutions Group e.V., is announcing
an Open Source tender for an implementation of an audio codec compatible
with the Nellymoser Asao Codec. Struktur will pay 6000 US$ for the
project. Proposals can be sent to tenders at struktur.de until October
31, 2006. The deadline for the project is January 15, 2007. For
questions, please contact me.

The full description can be found at
http://www.lisog.org/projekte/supported_projects/bounty-programm/nellymoserasaocodec-bounty.pdf

A description of the idea behind the bounty program can be
found at
http://www.lisog.org/projekte/supported_projects/bounty-programm/description-en.


The idea is comparable to the Google Summer of Code. LiSoG wants to
bring together skills, ideas and money to support the development of
certain features in Open Source Software.

The Linux Solutions Group e.V. (LiSoG) is an association founded in
March 2005 with the goal of promoting Linux-based business. With its
business orientation and its cross-national approach it has a unique
position in the German-speaking area. Besides global players in the IT
industry, the initiative promotes small and medium-sized companies
that deal with Open Source. Members include, among others, IT providers
such as IBM, MySQL, Fujitsu Siemens Computers, Novell, Red Hat, Collax
and Abraxas, as well as Stuttgarter Versicherung, the Federal State of
Bavaria, the Schweizer Bundesverwaltung and universities such as Linz,
Heilbronn, Nürnberg and Mannheim. Further information can be found at
http://www.lisog.org/

- --
Greetings
Nico Gulden

Technical Lead
Linux Solutions Group e.V. - LiSoG

Breitscheidstr. 4
70174 Stuttgart

Phone: +49 711/90715-393
Fax:   +49 711/90715-350
E-Mail: gulden at lisog.org
Jabber: ngulden at pub.uue.org
http://www.lisog.org

Board:
Dr. Karl-Heinz Strassemeyer (Chairman), IBM
Volker Smid (Deputy Chairman), Novell
Karl-Eugen Binder, Stuttgarter Lebensversicherung
Heiko Erhardt, skynamics
Dirk Haaga, Red Hat
Niels Mache, struktur
Dr. Jürg Römer, Bund ISB, Switzerland
Richard Seibt (Advisor to the Board)

Managing Director: Klaus Haasis
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFLPRYLOWWwXQ1zqIRCEq8AKCQRlWYcgm5kwNF5+tSiUjqND495wCfaerK
ShUSKVgBV2op1jcYvko7oYg=
=3rrp
-----END PGP SIGNATURE-----
-------------- next part --------------
Open Source Tender: Implementation of an Audio Codec Compatible with Nellymoser Asao Codec

Project Description

The project must produce:

An open source cross-platform library or program for real-time voice data compression and decompression which is compatible with the Nellymoser Asao codec as used by the Adobe/Macromedia Flash plugin.
We are looking for an open source implementation of a voice codec that a) encodes voice audio in a binary format which can be read and decompressed to an audio stream by the Nellymoser Asao codec and b) can decompress data encoded by the Nellymoser Asao codec into voice audio.
The decoding function of the codec must be capable of converting data streams produced by the Nellymoser Asao codec as used in the Adobe/Macromedia Flash plugin into 16-bit audio streams in real time.
The only requirement is that the data formats must be compatible. The requirement is not that encoded voice/audio must produce an identical binary data stream. Decompressing data streams produced by the Nellymoser codec should produce an audio stream of similar quality. It is not a requirement of the decoder to produce identical audio data.

Tendering process
struktur will pay 6000 US$ for this project. This is not a "lowest bidder" tender; proposals will be judged on quality and risk analysis. Please send your proposal to tenders at struktur.de and we will contact you with questions.

We will consider every proposal and everyone will be contacted. You will have a chance to clarify points in your proposal. We want an open communication channel with everyone.

Project requirements
The product is expected to support all features of the Nellymoser Asao codec that are used by the Macromedia Flash plugin. The product is not expected to support any other features that may be part of the proprietary version of the codec.

The baseline proposal, on which the budget is based, is as follows:

A cross-platform C library or program that can decode voice data that has been compressed by the Nellymoser Asao codec as used by the Macromedia Flash plugin. The library should encode real-time sampled voice (a data stream of voice samples in 8 or 16-bit resolution with a given sampling rate) in such a way that the encoded voice data can be played back by the Nellymoser Asao decoder which is built into the Flash plugin. It should be able to receive a Nellymoser Asao encoded stream produced by the Flash plugin as input and generate raw audio data, and vice versa. Sampling rates should be in the range from 8 to 22 kHz. The encoding and decoding operations should be computationally (and energy) efficient.

We are happy to consider other implementation proposals. Please keep in mind the following guidelines:

We like cross-platform: Linux, Windows, Mac, Unix.
We like reusable components: Libraries are great.
We don't like dependencies: Especially ones that are not standard on Linux servers. Dependency on X, Java or Mono will count against your proposal.

Note that the development work must be a "clean room" implementation. The development must not include code which is based on reverse engineering, decompiling or disassembling of proprietary software. The development work must not be based on any information released under non-disclosure or proprietary license, i.e. a license incompatible with open source licenses.

For your convenience we will provide sample data and streams that have been encoded with the Nellymoser Asao codec as used by the Adobe/Macromedia Flash plugin.

Copyright
The copyright of the code will remain with the project author (unless you wish to donate it to the open source community). However, you must make your work available under both the Apache 2.0 license and the LGPL.

Evaluation and Payment
The payment schedule will be negotiated with the selected project's representatives. Note however that our agreement with the donor involves the following constraints:

We will receive the first half of the donation on November 15.
We will receive the second half of the donation when all projects are completed.

As a result, we can only offer an initial payment in early November and we can provide no more than half the funds prior to completion of all projects. We will make an effort to tailor the payment schedule to your needs within these constraints.

The final payment will be contingent on completion of the project to the satisfaction of struktur's review committee.

Timeline
We expect that this project will take on the order of one month. The project should be completed by January 15, 2007. If you can't meet that deadline, please don't let that deter you from sending a proposal, but we will generally prefer proposals that do meet the deadline.

Sending your proposal
Deadline:	October 31
Length:	Approximately one side of A4/letter.
Format:	Plain text or OpenDocument Format.
Style:	Informal. Please include your name (or that of your company) and the email address where you wish to be contacted.
Send to:	tenders at struktur.de

You must explain in clear terms how your proposal works and how you plan to achieve the project objectives. You are encouraged to point to previous work you've done that can serve as a track record.

More information
The Nellymoser Asao voice codec produces a lossy single-channel (mono) format optimized for low-bitrate transmission of speech. The codec was developed by Nellymoser Inc. 

The Nellymoser codec has been an integral part of the Flash plugin since Flash version 6. The codec uses frequency-domain characteristics of speech for compression. The voice signal is sampled by an ADC and grouped into frames of 256 samples. Each frame is then converted into the frequency domain (by Fourier transformation) and the most significant (highest-amplitude, most energetic) frequencies are identified. A number of frequency bands are selected for encoding and the least significant bands are discarded. The bitstream for each frame encodes which frequency bands are in use and what their amplitudes are.
The encoding/decoding method is similar to standard compression techniques like MP3 or Vorbis. The main difference is that Nellymoser ASAO is optimized for voice and for real-time, low-bitrate, low-latency encoding/decoding.

Nellymoser ASAO packets are compressed at a fixed ratio of 8 to 1.
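
As a quick sanity check of that ratio, the figures given in this document (256 samples per frame, 64 bytes per packet) can be combined as in the following Python sketch. It assumes 16-bit input samples and exactly one packet per frame; neither is confirmed here for the real codec, and all names are made up for illustration.

```python
# Back-of-the-envelope check of the fixed compression ratio.
# Assumptions (not confirmed by the codec vendor): 16-bit input samples
# and one 64-byte packet per 256-sample frame.
SAMPLES_PER_FRAME = 256
BYTES_PER_SAMPLE = 2      # 16-bit PCM, assumed
PACKET_BYTES = 64


def raw_frame_bytes():
    """Size of one uncompressed frame in bytes."""
    return SAMPLES_PER_FRAME * BYTES_PER_SAMPLE


def compression_ratio():
    """Uncompressed frame size divided by compressed packet size."""
    return raw_frame_bytes() / PACKET_BYTES


def encoded_bitrate(sample_rate_hz):
    """Encoded bitrate in bit/s for a given sampling rate."""
    frames_per_second = sample_rate_hz / SAMPLES_PER_FRAME
    return frames_per_second * PACKET_BYTES * 8


if __name__ == "__main__":
    print("ratio:", compression_ratio())
    for rate in (8000, 11025, 22050):
        print(rate, "Hz ->", encoded_bitrate(rate), "bit/s")
```

Under these assumptions the encoded bitrate is simply twice the sampling rate, expressed in bit/s.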

The voice encoding performed by the Nellymoser encoder can be described in four steps:

1. Transformation:
the original 256 audio samples are transformed into the frequency domain.
2. Masking:
masking is applied in the frequency domain to reduce the number of significant coefficients.
3. Quantisation:
a number of the most significant frequency coefficients are quantized.
4. Compression:
the coefficients are Huffman (or similar) encoded to reduce redundancy and to take advantage of low entropy. The binary data, represented as a stream or as vectors (refer to MPEG-1 encoding), usually contains many consecutive zero bits as a result of frequency-domain masking and coefficient quantization.
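
The first three steps can be illustrated with a toy Python sketch. This is not the Nellymoser algorithm and does not produce a compatible bitstream: it uses a naive DFT as the transform, "keep the strongest bins" as a stand-in for masking, and a uniform quantizer. The entropy coding of step 4 is left out, and the parameter values (keep, step) are arbitrary assumptions.

```python
import cmath
import math

FRAME_SIZE = 256  # samples per frame, as stated in this document


def dft(samples):
    """Step 1: naive DFT of a real frame, returning bins 0..N/2."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]


def keep_strongest(spectrum, keep):
    """Step 2 (stand-in for masking): zero all but the `keep` largest bins."""
    ranked = sorted(range(len(spectrum)), key=lambda k: abs(spectrum[k]), reverse=True)
    kept = set(ranked[:keep])
    return [c if k in kept else 0j for k, c in enumerate(spectrum)]


def quantize(spectrum, step):
    """Step 3: uniform quantization of real and imaginary parts to integers."""
    return [(round(c.real / step), round(c.imag / step)) for c in spectrum]


def encode_frame(samples, keep=16, step=64.0):
    """Run steps 1-3 on one frame; entropy coding (step 4) is omitted."""
    if len(samples) != FRAME_SIZE:
        raise ValueError("expected a frame of %d samples" % FRAME_SIZE)
    return quantize(keep_strongest(dft(samples), keep), step)


if __name__ == "__main__":
    # A 1 kHz tone sampled at 8 kHz falls exactly on bin 32 of a 256-point DFT.
    tone = [10000.0 * math.sin(2 * math.pi * 32 * t / FRAME_SIZE) for t in range(FRAME_SIZE)]
    frame = encode_frame(tone)
    print([k for k, (re, im) in enumerate(frame) if re or im])
```

For a pure tone almost every quantized coefficient is zero, which is the kind of redundancy the entropy coder in step 4 is meant to exploit.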

The optimal encoding quality will be reached when the masking (which is probably constant) and quantization (likely to be dynamic) parameters are adjusted so that the resulting compressed binary stream is smaller than the sampled input data by a factor of 8.

Nellymoser ASAO stream data format:
The final compressed ASAO packet is always 64 bytes long. FLV audio tags may contain 1, 2 or 4 ASAO packets. Typically there are 20-40 audio tags per second. The FLV audio tag header is 13 bytes long.
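
Given those figures, splitting the audio data of one FLV tag into ASAO packets could look roughly like the Python sketch below. It assumes the tag header has already been stripped by the caller and that the remaining payload consists only of back-to-back 64-byte packets. The function name is hypothetical and not part of any existing library.

```python
PACKET_BYTES = 64                 # fixed ASAO packet size, per this document
ALLOWED_PACKET_COUNTS = (1, 2, 4) # packets per FLV audio tag, per this document


def split_asao_packets(payload):
    """Split the audio payload of one FLV tag into 64-byte ASAO packets.

    `payload` is assumed to be the tag data with the header already removed.
    Raises ValueError if the length does not match 1, 2 or 4 whole packets.
    """
    count, remainder = divmod(len(payload), PACKET_BYTES)
    if remainder or count not in ALLOWED_PACKET_COUNTS:
        raise ValueError("unexpected payload length: %d bytes" % len(payload))
    return [payload[i * PACKET_BYTES:(i + 1) * PACKET_BYTES] for i in range(count)]


if __name__ == "__main__":
    packets = split_asao_packets(bytes(128))
    print(len(packets), "packets of", len(packets[0]), "bytes")
```

Rejecting unexpected lengths early keeps a malformed tag from being handed to the decoder as if it were valid packets.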

Contact
struktur AG
c/o Niels Mache
email:	tenders at struktur.de

Postal address:
Kronenstr. 22A
D-70173 Stuttgart
Germany

Internet:	www.struktur.de

Resources
Audio and voice codecs:

http://en.wikipedia.org/wiki/MP3
http://en.wikipedia.org/wiki/Vorbis

Speex free software codec for speech:
http://en.wikipedia.org/wiki/Speex
http://www.speex.org/

Scientific Paper:
Improved Noise Weighting in CELP Coding of Speech - Applying the Vorbis Psychoacoustic Model To Speex
http://people.xiph.org/~jm/papers/aes120_speex_vorbis.pdf

Nellymoser Voice Codec:
http://www.actionscript.org/forums/showthread.php3?t=20430
http://www.nellymoser.com/
http://www.progettosinergia.com/flashvideo/flashvideoblog.htm

Flash:
http://osflash.org
http://www.flashcomguru.com

