Audio-MPEG


even on non-optimized machines, such as the PowerPC, it performs quite well
(faster than real-time on late 90's (and later) machines).

=head2 MAD

This is a relatively new MPEG decoding library. I chose it after struggling
to clean up the MPEG decoding library included with LAME (which is based
on Michael Hipp's mpg123(1) implementation). In the end, I was very pleased
with the results. MAD performs its decoding with an internal precision
of 24 bits (pro-level quality) using fixed-point arithmetic. The code
is very clean, and seems rock-solid. Although it may seem that it should
be faster than the mpg123(1) library due to the use of fixed-point arithmetic,
it in fact runs at only about 60% of the speed (due to the higher resolution
audio). However, the ease of coding against B<MAD> and the higher
precision of the output more than make up for the slower decoding.

B<Audio::MPEG> can export the data at its highest precision for programs
that wish to manipulate the data at the higher resolution.

=head2 Operating System Environment

I have only tested this on a Linux 2.4.x system so far, but I see no
reason why it should not work on any Un*x variant. In fact, it may actually
even work on a Windoze box (the underlying LAME and MAD libraries apparently
compile somehow on them). I am doing no special magic with the interface,
so presumably it will work under Windows. As you can probably tell, I
don't really care if it does (I may start caring if M$ releases
the source code to Windows under GPL, BSD, or Artistic licenses...). But,
for you poor, misguided souls that insist upon running Windows, I expect
that there should be little problem getting it to work.

=head2 Performance

You would think that with encoding/decoding audio, which is quite a
compute-intensive task, Perl would be much slower than the equivalent pure
C programs. Surprise... it is only about 3% slower (!). Even with the
mechanism I use here (Perl->C->Perl for B<every> frame), Perl 5.6.1 on
Linux 2.4.4 (PowerPC 7500) performs fantastically. So, the moral of this
paragraph is to run your own performance tests, but there's no need to
think that your own Perl encoder/decoder will be inferior to a pure C/C++
implementation. The only drawback is that, depending upon how much
buffer space you use for reading, memory usage will be at least 3 times
as much (eh... RAM is cheap...).
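
One way to run such a test is to time a decode pass and compare the elapsed
time with the duration of the audio it produced. The sketch below uses only
core modules. The name C<real_time_factor> and the stand-in workload are
illustrative and are not part of B<Audio::MPEG>; in a real test the code
reference would wrap your own decode loop and return the decoded duration.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Hypothetical helper: run $work (a code ref that returns the number of
# seconds of audio it processed) and report how many seconds of audio
# were handled per second of wall-clock time. Values above 1 mean
# "faster than real time".
sub real_time_factor {
    my ($work) = @_;
    my $start   = [gettimeofday];
    my $audio_s = $work->();
    my $wall_s  = tv_interval($start);
    $wall_s = 1e-9 if $wall_s <= 0;    # guard against a zero interval
    return $audio_s / $wall_s;
}

# Stand-in workload; replace with a real decode loop.
my $factor = real_time_factor(sub {
    my $sum = 0;
    $sum += $_ for 1 .. 100_000;
    return 10;    # pretend 10 seconds of audio were decoded
});
printf "%.1fx real time\n", $factor;
```

Running the same measurement against a pure C decoder on the same file gives
the comparison discussed above.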

=head1 INTERFACE

=head2 Audio::MPEG

This is simply the package that bootstraps the XS library, and there
is no external interface.

=head2 Audio::MPEG::Decode

=over 4

=item B<new>()

This creates a new object. Each object has its own private context, so
it is possible to have more than one object created at a time.

Once a stream has started to be decoded, the object may only be used for
that stream (due to state information kept in the object).

=item B<$len> = B<buffer>(I<$data>)

This method adds an arbitrary "chunk" of input MP3 data to the internal
buffering pool. Typically, this is at least 4KB of data. A good length
of data to pass is 40KB (approximately 1 second of audio encoded at 320kbps
or 2.5 seconds of audio encoded at 128kbps).

Method returns the length of data, in bytes, that has not been decoded yet.

=item B<decode_frame>()

This method will process the next MP3 frame of the data that has been
buffered with B<buffer>() and prepare it for PCM synthesis. The prepared
data is stored in the object. Do not use both this function and
B<decode_frame_header>() on the same object.

Method returns 1 if a frame was decoded (successfully or not), and 0
if it ran out of data before finishing decoding.

Upon return, the program should interrogate I<< $obj->err >>. If it is > 0,
then a decoding error has occurred, and no PCM synthesis is possible (i.e. the
frame should be skipped). See the EXAMPLES section later in this document.

=item B<decode_frame_header>()

This method will process the next MP3 frame of the data that has been
buffered with B<buffer>(), and does B<not> prepare it for PCM synthesis.
The intent of this function is to verify the framing of the MP3 stream
for a rapid integrity check of the file. It is not a complete
check, as that is possible only with full decoding. However, simply
performing this framing check will catch the majority of errors found with
MP3 files. Do not use both this function and B<decode_frame>() on the same
object.

Method returns 1 if a frame was parsed (successfully or not), and 0
if it ran out of data before finishing parsing.

Upon return, the program should interrogate I<< $obj->err >>. If it is > 0,
then a decoding error has occurred (i.e. the frame should be skipped).
See the EXAMPLES section later in this document.

=item B<verify_mp3file>($I<filename> [, $I<full_verify>, $I<num_errs>])

This is a convenience function that will return 1 if the MP3 file
has fewer than 5 framing errors, or undef if there were more.

If the second parameter is 1, a full decoding of the MP3 file will
occur. If undef, it will only decode the frame headers and not the data
as well.

This may be further tuned by passing a third parameter that indicates the number
of errors to be found before declaring the verification a failure.

Method returns 1 if file is OK, undef if damaged.

=item B<synth_frame>()

This method will synthesize the PCM data for a single frame that was
prepared by B<decode_frame>(). The output PCM frame is stored in the object.

=item B<err>()

Returns the last error code, or 0 if no error. This, or B<err_ok>(), should be
checked after every B<decode_frame>() or B<decode_frame_header>() call.

=item B<err_ok>()

Returns 1 if the error is recoverable, or 0 if it is a bad (unrecoverable)
error. This, or B<err>(), should be checked after every B<decode_frame>() or
B<decode_frame_header>() call.

=item B<errstr>()

Returns an English string describing the error condition.

=item B<current_frame>()

Returns the current MP3 frame that was decoded.

=item B<total_frames>()

Returns the total number of MP3 frames decoded. Used after decoding has
been completed.

=item B<frame_duration>()

Returns the length of the frame, in seconds, that was decoded.

=item B<total_duration>()

Returns the total duration, in seconds, that was decoded. Used after decoding
has been completed.

=item B<bit_rate>()

Returns the bitrate, in kbps, of the frame that was decoded.

=item B<average_bit_rate>()

Returns the average bitrate of the decoded frames. Used after decoding has
been completed.

=item B<sample_rate>()

Returns the samplerate, in Hertz, of the decoded frame.

=item B<layer>()

Returns the MPEG audio layer number of the decoded frame.

=item B<channels>()

Returns the number of PCM channels (1 or 2) in the decoded frame.

=item B<pcm>()

Returns the synthesized PCM structure of the decoded/synthesized frame.
The data is in a 24-bit fixed-point format, and is only intended for
passing to an B<Audio::MPEG::Output> object. It is also intended to be
used by a planned future filtering object.

=back
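
The methods above are typically combined into a read/decode/synthesize loop.
The following is a minimal sketch of that loop, written against any object
that provides the B<Audio::MPEG::Decode> methods documented here. The names
C<decode_stream> and C<$on_pcm> are illustrative and are not part of the
module, and the choice to die on an unrecoverable error is an assumption.

```perl
use strict;
use warnings;

# Sketch of a decode loop. $decoder is expected to behave like an
# Audio::MPEG::Decode object; $fh is a binary-mode input handle and
# $on_pcm is a callback that receives each synthesized PCM frame.
# Returns the number of damaged frames that were skipped.
sub decode_stream {
    my ($decoder, $fh, $on_pcm) = @_;
    my $bad_frames = 0;

    # 40_000 bytes is roughly 1 s at 320 kbps or 2.5 s at 128 kbps.
    while (read($fh, my $chunk, 40_000)) {
        $decoder->buffer($chunk);

        # decode_frame() returns 0 once the buffered data runs out.
        while ($decoder->decode_frame) {
            if ($decoder->err > 0) {
                # Assumption: give up on an unrecoverable error,
                # otherwise skip the damaged frame and carry on.
                die 'decode failed: ' . $decoder->errstr . "\n"
                    unless $decoder->err_ok;
                $bad_frames++;
                next;
            }
            $decoder->synth_frame;
            $on_pcm->($decoder->pcm);
        }
    }
    return $bad_frames;
}
```

With the real module, the first argument would be the result of
C<< Audio::MPEG::Decode->new >>, and the callback would usually hand each
PCM frame to an B<Audio::MPEG::Output> object.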

=head2 Audio::MPEG::Output

=over 4

=item B<new>(I<\%parameters>)

This creates a new object. Each object has its own private context, so
it is possible to have more than one object created at a time.

The parameters for new are as follows:

=over 4

=item I<out_sample_rate>

The target output samplerate (in Hertz). If this does not match the input
samplerate of the PCM samples, it will be resampled. Default is 44_100.

=item I<out_channels>

The number of output channels. If different from the input PCM samples,
it will be adjusted (mono->stereo or stereo->mono). Default is 2.

=item I<mode>

The algorithm used to decrease the precision of the input samples to match
the output precision. Valid values are 1 for simple rounding, and 2 for
dithering. Default is 2.

=item I<type>

The output stream format. Valid values are 1 (unsigned 8 bit PCM),
2 (signed 16 bit PCM), 3 (signed 24 bit PCM), 4 (signed 32 bit PCM),
5 (4 byte float PCM), 6 (8 bit Sun mulaw), and 7 (Microsoft WAV).

All PCM formats are in the B<native> byte order (i.e. big- or little-endian)
of the machine that generates the output. Default is 5.

=item I<apply_delay>

This will correct for the MP3 decoding delay. If set to 1, 1/2 of the first
frame's PCM stream will be skipped and not converted to an output stream.
Default is to not correct for delay.

=back

=item B<header>($I<datasize>)

This method will return a header (first few bytes of data) that is valid
for the output type. $I<datasize> refers to the length of audio data in bytes.
If not passed the length, B<header>() will output a valid header, except that
the embedded length will be zero. After the sample is decoded, B<header>() is
typically called again with the real length, and the result is written over
the beginning of the file (see the EXAMPLES section of this document). If
called on an object whose output type does not have a header, this returns an
empty scalar.

Currently, only Sun mulaw and WAV formats have headers.

=item B<encode>($I<pcm>)

This method will encode an input PCM stream and return a scalar containing
the output audio stream. Input is typically the output of the
B<pcm>() method of B<Audio::MPEG::Decode>.

=item B<clipped_samples>()

Returns the number of samples that had to be clipped to fit in the output
format.

=item B<peak_amplitude>()

Returns the amplitude, in decibels, of the highest sample.

=back
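
The two-pass use of B<header>() described above can be sketched as follows.
The function is written against any object that provides the B<header>() and
B<encode>() methods documented here. The name C<write_audio> is illustrative
and is not part of the module, and the sketch assumes a seekable file handle
and a header whose byte length does not depend on the size passed in.

```perl
use strict;
use warnings;

# Sketch: write encoded audio behind a placeholder header, then rewrite
# the header once the data size is known. $output is expected to behave
# like an Audio::MPEG::Output object; @pcm_frames holds PCM structures
# such as those returned by the decoder's pcm() method.
# Returns the number of audio bytes written (header excluded).
sub write_audio {
    my ($fh, $output, @pcm_frames) = @_;
    binmode $fh;

    print {$fh} $output->header;    # embedded length is zero for now
    my $bytes = 0;
    for my $pcm (@pcm_frames) {
        my $data = $output->encode($pcm);
        print {$fh} $data;
        $bytes += length $data;
    }

    my $header = $output->header($bytes);
    if (length $header) {           # headerless types return ''
        seek($fh, 0, 0) or die "seek failed: $!";
        print {$fh} $header;
    }
    return $bytes;
}
```

With the real module, C<$output> would typically be created with something
like C<< Audio::MPEG::Output->new({ type => 7 }) >> for WAV output. In a
streaming program the frames would be encoded one at a time inside the decode
loop rather than collected in a list first.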

=head2 Audio::MPEG::Encode

=over 4

=item B<new>(I<\%parameters>)

This creates a new object. Each object has its own private context, so
it is possible to have more than one object created at a time.

The parameters for new are as follows:

=over 4

=item I<in_sample_rate>

This is the input sample rate (in Hertz) and is used in decoding the PCM
stream passed to the I<encode>() methods of this class. Allowed values are
8_000, 11_025, 12_000, 16_000, 22_050, 32_000, 44_100, and 48_000. If not
set, it will default to 44_100.

=item I<in_channels>

This is the number of input channels and is used by the I<encode>() methods
of this class in decoding the PCM stream. If not set, default is 2 channels
(stereo input).

=item I<out_sample_rate>

If set, the output is resampled, if necessary, to this sample rate (in Hertz).
Allowed values are 8_000, 11_025, 12_000, 16_000, 22_050, 32_000, 44_100,
and 48_000. If not set, the LAME library will automatically select the best
output sample rate based on the other settings (such as compression ratio
or bitrate).

Note that this setting is B<independent> of the input sample frequency: LAME
will resample if required.


