Difference between revisions of "User:Mjb/Workspace"

From Offset
(started new section: Custom compiling AACGain)
(Custom compiling AACGain: +steps taken)
Line 187: Line 187:
  
 
==Custom compiling AACGain==
 
==Custom compiling AACGain==
I'm going to try to build an [[:wikipedia:Streaming SIMD Extensions|SSE]]-optimized (Pentium III and up) aacgain.exe. I'll trace my steps here.
+
I'm going to try to build an [[:wikipedia:Streaming SIMD Extensions|SSE]]-optimized (Pentium III and up) version of [http://altosdesign.com/aacgain/ AACGain]. I'll trace my steps here.
  
* Download [http://msdn2.microsoft.com/en-us/vstudio/aa700736.aspx Visual C++ 2005 Express Edition] from Microsoft. Install it (reboot required). Familiarize self with <code>[http://msdn2.microsoft.com/en-us/library/7t5yh4fd(VS.71).aspx /arch:SSE]</code>
+
Build instructions are in the [http://mp3gain.cvs.sourceforge.net/mp3gain/aacgain/README?view=markup AACGain README].
 +
 
 +
* Download [http://msdn2.microsoft.com/en-us/vstudio/aa700736.aspx Visual C++ 2005 Express Edition] from Microsoft. Install it (reboot required).
 +
* Familiarize self with <code>[http://msdn2.microsoft.com/en-us/library/7t5yh4fd(VS.71).aspx /arch:SSE]</code> option.
 +
 
 +
* Download [http://www.tortoisecvs.org/download.shtml TortoiseCVS]. Install it, choosing to restart Explorer when prompted. I was also prompted to restart Windows for changes to CVSNT to take effect. I chose not to do that because the setup hadn't finished!
 +
 
 +
* The Microsoft Platform SDK is required. There are [http://msdn2.microsoft.com/en-us/vstudio/aa700755.aspx instructions for setting up Visual C++ Express to work with the SDK], but they tell you to get the latest version: Windows Server 2003 R2 SDK - March 2006 Edition. This can build Win2K apps but apparently doesn't run on Win2K itself. Since I'm doing the build on Win2K, I'll download and install the older version: [http://www.microsoft.com/downloads/details.aspx?familyid=A55B6B43-E24F-4EA3-A93E-40C0EC4F68E5&displaylang=en Windows Server 2003 SP1 Platform SDK - April 2005 Edition].

Revision as of 18:10, 6 February 2008

Misc

file URI related

Important links

  • IETF and the RFC Standards Process, from The Art of Unix Programming by Eric Steven Raymond, emphasizes that Internet RFCs and standards tend to be based more on actual implementation than pie-in-the-sky theory
  • File system info by Chris Giese, in addition to providing various FAT technical details, gives additional details about encodings, legal characters, and limitations of FAT12, FAT16, VFAT, FAT32, NTFS, ext2, ISO9660, Joliet, and HFS+
  • Wikipedia:en:Comparison of file systems covers a lot of ground, and links to separate articles about each file system. Check the discussion page as well.
  • This page from IBM's WebSphere CORBA documentation is an example of an implementation expecting to see ":" and "\" in a 'file' URL
  • This Lynx documentation shows how an implementation might treat '~' specially in a 'file' URL

Mailing list posts and notable quotes

'file' URI conventions (13 July 2004) - Mike Brown brings up many issues that complicate the mapping of file system paths to URIs

What to do about file: (19 August 2004) - Paul Hoffman points to a now-expired Internet-Draft that was just RFC 1738's 'file' URI section pulled out into a separate document, and asks about courses of action:

  • Publish it as-is (which would accomplish nothing other than hastening the retirement of RFC 1738)
    • "no" — Larry Masinter
  • Prescribe what implementations SHOULD do, knowing that such a prescription is bound to break many/most existing implementations
    • "this would be useful if it were accompanied by documentation of the caveats." — Larry Masinter
  • List many more interpretations that current implementations use, but not say whether or not to do them
    • "I propose a variant of this: list the interpretations known to be in use, labeled with who uses them, in an informative section." — John Cowan
  • Say more about the wide variety of interpretations, but don't list them so as not to confuse readers

An RFC that says, essentially, "Internet Explorer on post-4.0 versions on Windows platforms does X, while Gecko-based engines on linux platforms do Y, on Windows platforms do Z, while the popular LWP perl library does W, java.net.URI does U…" would feel profoundly weird to me. — Tim Bray [1]

Not all RFCs prescribe standards, and this is information that would be profoundly useful to the Internet community. … It would be excellent to have a single reasonably authoritative place to go, rather to have to run one's own experiments all the time. — John Cowan [2]

RE: What to do about file: (19 August 2004) - Larry Masinter suggests some topics to cover

File system path to URI

When converting from any file system path to a URI, questions to consider include the following.

For what kind of file system is the path?

  • MS-DOS and Windows: FAT16, VFAT, FAT32, NTFS
  • Unix-like OSes: UFS, UFS2, ext2, ext3, ReiserFS V3, Reiser4
  • Legacy Mac OS: HFS+

There are differences in how these file systems store directory entries, what characters they allow, how paths manifest in internal APIs, and how paths manifest to the end user of the OS.

If the path's file system is not known…

…what should you do?

  • Assume the path is from a default file system for the local OS? Many OSes offer a choice of file systems. How can you be sure you got it right? Is there a "good enough", file system-agnostic fallback?
  • Maybe just reject the path? IOW, just say that the file system type must be known.

If the path's file system is not recognized…

…what should you do?

  • Reject the path?
  • Use a default algorithm, like just prepending 'file:' and doing whatever percent-encoding is required?

Is the path 'absolute'?

  • If it's a UNIX path, whether it starts with "/" is the only qualification, I believe.
  • If it's a Windows path, it could be absolute if it matches the regular expression ^(\\|[A-Za-z]:) - that is, it either starts with "\" or a drivespec (an ASCII-range letter followed by ":").
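
A minimal sketch of the two checks above, in Python. The function name and the 'unix'/'windows' flavor labels are made up for illustration; the Windows pattern is the regular expression quoted in the list.

```python
import re

# Pattern from the notes: a leading backslash, or a drivespec
# (an ASCII-range letter followed by ":").
WINDOWS_ABSOLUTE = re.compile(r'^(\\|[A-Za-z]:)')

def is_absolute(path, flavor):
    """Guess whether `path` is absolute.

    `flavor` is a hypothetical label ('unix' or 'windows') naming the
    path syntax; it is not detected automatically.
    """
    if flavor == 'unix':
        return path.startswith('/')
    if flavor == 'windows':
        return WINDOWS_ABSOLUTE.match(path) is not None
    raise ValueError('unknown path flavor: %r' % (flavor,))
```

Note that the pattern also accepts 'C:the\path', which Windows treats as relative to the current directory on drive C, so "could be absolute" is the most this check can say.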

If the path is not absolute…

…what should you do?

  • Reject it?
  • Create a relative URI reference? ('the/path')
  • Create an RFC 3986-compliant, but RFC 1738-offending, URI like 'file:the/path'?
  • Attempt to make the path absolute by interpreting it to be relative to the local host's 'current working directory', if such a concept exists in the local OS? What if the path is for some other file system?
  • And do you make it absolute according to the file system's conventions first, or do you do an RFC 3986 conformant resolution of a relative URI reference ('the/path') against the base URI that is derived from the current working directory?

Does the path contain same- or parent- (. or .., for example) references?

Do you attempt to collapse dot segments (or equivalent) in the path or in the resulting URI? Does it depend on whether the path or URI is absolute? A reason to collapse dot segments in an absolute URI is so that the URI can be suitable for use as a base URI for RFC 3986 conformant resolution.
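
For reference, here is a sketch of dot-segment collapsing applied to the path component of a URI, following the remove_dot_segments procedure in RFC 3986 section 5.2.4. It works on URI paths only; whether to do the same thing to the file system path first is the open question above.

```python
def remove_dot_segments(path):
    """Collapse '.' and '..' segments in a URI path (RFC 3986, 5.2.4)."""
    output = []
    while path:
        if path.startswith('../'):
            path = path[3:]
        elif path.startswith('./'):
            path = path[2:]
        elif path.startswith('/./'):
            path = path[2:]            # keep the trailing '/'
        elif path == '/.':
            path = '/'
        elif path.startswith('/../'):
            path = path[3:]            # keep the trailing '/'
            if output:
                output.pop()           # drop the previous segment
        elif path == '/..':
            path = '/'
            if output:
                output.pop()
        elif path in ('.', '..'):
            path = ''
        else:
            # Move the first segment (with its leading '/', if any)
            # from the input to the output.
            start = 1 if path.startswith('/') else 0
            end = path.find('/', start)
            if end == -1:
                end = len(path)
            output.append(path[:end])
            path = path[end:]
    return ''.join(output)
```

With the RFC's own examples, '/a/b/c/./../../g' becomes '/a/g' and 'mid/content=5/../6' becomes 'mid/6'.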

Is the mapping between segments in the filesystem path and segments in the path component of the URI well-defined?

On Unix file systems, it should be sufficient to percent-encode all non-unreserved characters in each segment. Note that a Unix file name itself cannot contain '/', but names on some other file systems can (HFS+ names as shown by legacy Mac OS, for example), so be sure to apply percent-encoding to each segment individually rather than to the path as a whole.
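
A small sketch of per-segment encoding in Python. It assumes the absolute path has already been split into a list of segment names, that UTF-8 is the basis for percent-encoding, and that the authority component is left empty; the function name is hypothetical.

```python
from urllib.parse import quote

def segments_to_file_uri(segments):
    """Build a 'file' URI from a list of path segments.

    Each segment is encoded on its own with safe='' so that a '/'
    inside a segment becomes %2F instead of a segment separator.
    quote() encodes str input as UTF-8 by default.
    """
    encoded = [quote(segment, safe='') for segment in segments]
    return 'file:///' + '/'.join(encoded)
```

For example, segments_to_file_uri(['home', 'user', 'my file.txt']) gives 'file:///home/user/my%20file.txt', and a segment named 'a/b' comes out as 'a%2Fb'.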

On Windows, complications abound. (I think I cover these below)

If the path purports to be for a particular OS, but does not match that OS's syntax for a path, e.g. 'C:/autoexec.bat' on Windows…

  • Reject the path?
  • Be as lenient as possible, e.g. replace '/' with '\' for Windows?
  • What about '9:\autoexec.bat' on Windows (bad drivespec)? acceptable?

If the path is provided as a sequence of Unicode characters…

  • Form the URI by leaving unreserved characters as-is, and percent-encoding the rest, using UTF-8 as the basis? (RFC 3986 default)
  • Use some other encoding more appropriate to the path's OS?

If the path is provided as a sequence of bytes

(not Unicode characters, with no additional info about encoding)…

  • Reject it because it can't be decoded to Unicode?
  • Assume a default encoding? based on...? How confident can you be about, say, a file system default encoding? (probably not very, on Unix)
  • Attempt no decode; just form the URI by converting to unreserved characters only those bytes that, when decoded as ASCII, correspond to unreserved characters, and percent-encoding the rest of the bytes individually?
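
The last option (no decode at all) can be sketched in Python as follows. This assumes a Unix-style path given as raw bytes with '/' (0x2F) as the separator; the function name is made up. In Python 3.7 and later, quote() leaves ASCII letters, digits, and "_.-~" alone, which matches the RFC 3986 unreserved set.

```python
from urllib.parse import quote

def bytes_path_to_uri_path(raw):
    """Percent-encode a bytes path without decoding it.

    Bytes that correspond to unreserved ASCII characters are kept;
    every other byte is percent-encoded individually. Splitting on
    b'/' keeps the separators out of the encoding step.
    """
    return '/'.join(quote(segment, safe='') for segment in raw.split(b'/'))
```

For instance, b'/tmp/caf\xe9.txt' (Latin-1 bytes, but the code neither knows nor cares) becomes '/tmp/caf%E9.txt'. The cost is that the resulting URI says nothing about which characters were meant.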

For a Windows path, is it in the form of a local path or a UNC path?

("local" may not be the right term)

  • local, absolute, with drivespec: C:\autoexec.bat
  • local, absolute, no drivespec: \autoexec.bat
  • local, relative: the\path
  • UNC: \\host\share\autoexec.bat
  • Do you map the UNC host name to the authority component? Don't forget to percent-encode.
  • Do you leave the UNC share name as the first segment of the path component, or..? And don't forget to percent-encode.
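
One possible answer to the questions above, sketched in Python: map the UNC host name to the authority component and keep the share name as the first path segment. That mapping is an assumption made for illustration, not an established rule, and the function name is hypothetical. Relative and drivespec-less paths are rejected here rather than guessed at.

```python
import re
from urllib.parse import quote

DRIVE_ABSOLUTE = re.compile(r'^[A-Za-z]:\\')

def windows_path_to_file_uri(path):
    """Convert an absolute Windows path (drivespec or UNC) to a 'file' URI."""
    if path.startswith('\\\\'):
        # UNC: \\host\share\rest -> file://host/share/rest
        parts = path[2:].split('\\')
        host, segments = parts[0], parts[1:]
        return ('file://' + quote(host, safe='') + '/' +
                '/'.join(quote(s, safe='') for s in segments))
    if DRIVE_ABSOLUTE.match(path):
        # Drivespec: C:\rest -> file:///C:/rest (the ':' is left as-is)
        segments = path[3:].split('\\')
        return ('file:///' + path[0] + ':/' +
                '/'.join(quote(s, safe='') for s in segments))
    raise ValueError('not an absolute drivespec or UNC path: %r' % (path,))
```

With this sketch, '\\host\share\autoexec.bat' maps to 'file://host/share/autoexec.bat' and 'C:\autoexec.bat' maps to 'file:///C:/autoexec.bat'.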

Exceptional UNC paths

Networked instances of Windows do weird things like refer to network printers like this: '\\http://192.168.0.1/printername', and refer to shared drives like this: '\\server\d$\autoexec.bat'. When are these conventions used? I saw the former today, and the latter a few years back on NT4 systems. Are they documented anywhere, and do you want to attempt to deal with them?

Update 2006-04-03: I found out that "$" at the end of a share name is a naming convention that causes it to be hidden from network browsers and 'net view'. [3] The format of a UNC path is \\server\share\path\filename.

Windows case normalization

For a Windows path, do you do any case normalization, e.g. in the drivespec? ('c:' -> 'C:')

Windows and colon characters

Windows uses ":" in the drivespec (and nowhere else, currently). ":" is a reserved character in a URI, but does not need to be percent-encoded in a path segment. Therefore, 'file:///C:/autoexec.bat' is acceptable as a URI, and is equivalent to 'file:///C%3A/autoexec.bat'.

There is a convention of using "|", e.g. 'file:///C|/autoexec.bat', I believe because of the ambiguities that arise when you have situations like 'C:/foo' as a relative URL being resolved against, say, 'file:/autoexec.bat' or 'file:C:/autoexec.bat' and so on - things that appear in the wild and may(?) have been canon at one time, but don't play nicely with any relative resolution algorithms.

I haven't much sympathy for "|" and feel it should be deprecated as much as possible. Resolvers should continue to accept it and treat it as synonymous with a drivespec ":". On that note, though, should they treat all "|" as ":", or just those that appear to be a drivespec?
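
A sketch of the narrower choice, in Python: rewrite "|" to ":" only when it directly follows a single ASCII letter in the first path segment of a 'file' URI, and leave every other "|" alone. The pattern and the function name are my own assumptions, not taken from any specification.

```python
import re

# 'file:' + optional '//authority' + '/' + one ASCII letter + '|',
# where the '|' must end the first path segment.
LEGACY_DRIVESPEC = re.compile(
    r'^(file:(?://[^/]*)?/)([A-Za-z])\|(?=/|$)', re.IGNORECASE)

def normalize_pipe_drivespec(uri):
    """Turn a legacy 'C|' drivespec into 'C:'; other '|' are untouched."""
    return LEGACY_DRIVESPEC.sub(r'\1\2:', uri)
```

So 'file:///C|/autoexec.bat' becomes 'file:///C:/autoexec.bat', while 'file:///C:/a|b.txt' and 'file:///home/a|/b' pass through unchanged.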

If ":" or "|" ever become legal characters in Windows paths… then what?

Empty path segments

Empty segments in the path: collapse them? Depends on OS?

This gets tricky when round-tripping on Windows with UNC paths. I'd have to experiment again to give you some good examples, though. (I decided not to worry about it too much.)

Discogs related

MP3 related

Lossless MP3 editing on Windows

To be absolutely safe:

  1. mp3packer -b 320 -r inputfile.mp3 tempfile.mp3 to all but eliminate bit reservoir usage
  2. Edit tempfile.mp3 in mp3DirectCut
  3. mp3packer -s -t -z tempfile.mp3 outputfile.mp3 to recompress to VBR

However, it's generally safe to just use mp3DirectCut on the files, if all you're doing is trimming silence/noise from the ends.

There's another lossless editor called mpgedit. There's a console version (mpgedit) and a graphical version (xmpgedit), and it is available for multiple platforms. However, when I tried it on Windows, it didn't really work at all. Its editing interface isn't intuitive, and mp3DirectCut seems to just be a little better all around.

I used to use mpTrim for trimming silence from the ends, but the freeware version only works on smallish files, and it has nothing resembling a wave editor, so you have to 'earball' it. I now prefer mp3DirectCut.

MP3 SHOUTcast stream transcoding

Before I switched to AAC+, I was looking into offering a low-bitrate MP3 version of my audio stream by connecting to the regular stream as a client, transcoding it, and feeding it to another server. This isn't necessary when using recent versions of the SHOUTcast DSP for Winamp; it can feed multiple streams at different bitrates to different servers. But I was using an older DSP so that I could use an external MP3 codec, and had to come up with a workaround. I didn't get very far in my research, but did find several options for grabbing a stream to a file on Unix:

mpg123 -b 4096 -s http://212.72.165.24:9154/ > out.mp3
gnetcat -l -p 8081 212.72.165.24 9154 | lame --mp3input -b 64 - out.mp3
mpg123 -b 4096 -s http://212.72.165.24:9154/ | lame --mp3input -b 64 - out.mp3

And then there was also fIcy: an icecast/shoutcast stream grabber suite.

I don't remember if I tried any of these. The idea, though, was that if I could send the data to a file, the file could actually be a named pipe into icecast or something. However, it didn't occur to me that I'd probably lose the protocol data (stream info and song titles). So it was starting to look like the only real option was going to be streamTranscoder, last updated in 2004.

I don't care anymore. I'm just keeping this info for possible future reference, because I remember it was hard to find some of this info.

Info about my old SHOUTcast setup

My previous setup used the 'Radium' pirated Fraunhofer MP3 codec at 96 kbps CBR, which provided excellent sound quality due to its lowpass filter discarding frequencies above 11.5 kHz, and its ability to force the use of L/R stereo mode rather than poorly-tuned M/S. MP3 codecs, especially Fraunhofer's, do very well at sub-128 kbps bitrates, especially when their lowpass filters are set to cut off the bandwidth-intensive higher-frequency audio.

Why didn't I use LAME? Well, I tried it several times, but for various reasons, I was required to use a codec packaged as a Windows .acm file, and LAME's ACM build isn't nearly as tunable as the command-line or DLL versions. The biggest problem is that they've set its lowpass filter a bit too high, around 14 kHz, I think, with no way to change it. The result is inaccurate, noisy upper midrange, which is tolerable for speech, but that's too much high end to try to cram into a 96 kbps MP3 stream of music; something's got to give. They try to make up for it a bit by requiring joint stereo, which lets the encoder decide whether to use L/R or M/S in each frame. A lot of effort has gone into tuning LAME at higher bitrates so that it makes a good guess for that decision, but at 96 kbps CBR, it's no good; the result is too much noise and/or poor stereo separation. If I could do the encoding with LAME's command-line interface, it would be no problem, because I could just force L/R stereo mode and a ~12.5 kHz lowpass, which give it better high end than the Radium codec, with no degradation. If the stream weren't limited to CBR, I'd use VBR and joint stereo, for which LAME has been tuned very well.

I use AAC+ now. Screw MP3.

Custom compiling AACGain

I'm going to try to build an SSE-optimized (Pentium III and up) version of AACGain. I'll trace my steps here.

Build instructions are in the AACGain README.

  • Download TortoiseCVS. Install it, choosing to restart Explorer when prompted. I was also prompted to restart Windows for changes to CVSNT to take effect. I chose not to do that because the setup hadn't finished!