Friday, January 27. 2012

A New Mathematics for Computing

Via Input Output -----
We live in a world where digital technology is so advanced that even what used to be science fiction looks quaint. (Really, Kirk, can you download Klingon soap operas on that communicator?) Yet underlying it all is a mathematical theory that dates back to 1948. That year, Claude Shannon published the foundational paper of information theory, A Mathematical Theory of Communication. Shannon's work underlies communications channels as diverse as proprietary wireless networks and leads on printed circuits. Yet in Shannon's time, as we all know, there were no personal computers, and the only guy communicating without a desk telephone was Dick Tracy. In our current era, transistor density on chips has grown so high that the potential limits of Moore's Law are routinely discussed. Multi-core processors, once beyond the dreams of even supercomputers, are now standard on consumer laptops. Is it any wonder, then, that even Shannon's foresighted genius may have reached its limits?
From the Internet to a child's game of telephone, the challenge in communications is to relay a message accurately, without losing any of it to noise. Prior to Shannon, the answer was a cumbersome redundancy. Imagine a communications channel as a conveyor belt, with every message as a pile of loose diamonds: If you send hundreds of piles down the belt, some individual diamonds might fall off, but at least you get most of the diamonds through quickly. If you wrap every single diamond in bubble wrap, you won't lose any individual diamonds, but now you can't send as many in the same amount of time. Shannon's conceptual breakthrough was to realize you could wrap the entire pile in bubble wrap and save the correction for the very end. "It was a complete revolution," says Leonard Schulman. "You could get better and better reliability without actually slowing down the communications." In its simplest form, Shannon's idea of global error correction is what we know as a "parity check" or "checksum," in which a few digits appended to the end of the message are computed from all the preceding digits, so the receiver can verify that what arrived is what was sent. (In a more technical scheme: The message is a string n bits long. It is mapped to another, longer string, a code word. If you know what the Hamming distance—the number of positions in which two such strings differ—should be between any two code words, you can detect and correct errors.)
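To make the parity-check idea concrete, here is a minimal Python sketch (the function names are ours, purely for illustration): it appends a single check bit to a message and measures the Hamming distance between two equal-length bit strings.

```python
def add_parity_bit(bits):
    """Append one check bit so the total number of 1s is even."""
    return bits + [sum(bits) % 2]

def parity_ok(coded):
    """Any single flipped bit makes the overall parity odd."""
    return sum(coded) % 2 == 0

def hamming_distance(a, b):
    """Number of positions in which two equal-length strings differ."""
    assert len(a) == len(b)
    return sum(x != y for x, y in zip(a, b))

msg = [1, 0, 1, 1]
coded = add_parity_bit(msg)                 # [1, 0, 1, 1, 1]
print(parity_ok(coded))                     # True
corrupted = [1, 0, 0, 1, 1]                 # one bit flipped in transit
print(parity_ok(corrupted))                 # False: the error is detected
print(hamming_distance(coded, corrupted))   # 1
```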
Starting back as far as the late 1970s, researchers in the emerging field of communication complexity were beginning to ask themselves: Could you do what Shannon did, if your communication were more of a dialogue than a monologue? What was a theoretical question then is becoming a real-world problem today. With faster chips and multi-core processors, communications channels are becoming two-way, with small messages going back and forth. Yet error correction still assumes big messages going in one direction. "How on earth are you going to bubble wrap all these things together, if what you say depends on what I said to you?" says Schulman. The answer is to develop an "interactive" form of error correction.

In essence, a message sent using classic Shannon error correction is like a list of commands: "Go Left, Go Right, Go Left again, Go Right…" Like a suffering dinner date, the receiver only gets a chance to reply at the end of the string. By contrast, an interactive code sounds like a dialogue between sender and receiver: "Going left."

The underlying mathematics for making this conversation highly redundant and reliable is a "tree code." To the lay person, tree codes resemble a branching graph. To a computer scientist they are "a set of structured binary strings, in which the metric space looks like a tree," says Schulman, who wrote the original paper laying out their design in 1993. "Consider a graph that describes every possible sequence of words you could say in your lifetime. For every word you say, there's thousands of possibilities for the next word," explains Amit Sahai. "A tree code describes a way of labeling this graph, where each label is an instruction. The ways to label the graph are infinite, but remarkably Leonard showed there is actually one out there that's useful."

Tree codes can branch to infinity, a dendritic structure that allows for error correction at multiple points. It also permits tree codes to remember entire sequences; how the next reply is encoded depends on the entire history of the conversation. As a result, a tree code can perform mid-course corrections. It's as if you think you're on your way to London, but increasingly everyone around you is speaking French, and then you realize, "Oops, I got on at the wrong Chunnel station." The result is that instructions are not only conveyed faster; they're acted on faster as well, with real-time error correction. There's no waiting to discover too late that an entire string requires re-sending.

In that sense, tree codes could help support the parallel processing needs of multi-core machines. They could also help in the increasingly noisy environments of densely packed chips. "We already have chip-level error correction, using parity checks," says Schulman. But once again, there's the problem of one-way transmission, in which an error is only discovered at the end. By contrast, a tree code "gives you a check sum at every stage of a long interaction," says Schulman.

Unfortunately, those very intricacies are why interactive communications aren't coming to a computer near you anytime soon. Currently, tree codes are in the proof-of-concept phase. "We can start from that math and show they exist, without being able to say: here is one," says Schulman. One of the primary research challenges is to make the error correction as robust as possible. One potential alternative that may work much sooner was just published by Sahai and his colleagues.
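Before turning to that result, here is a toy Python sketch of the property a tree code is after (our own illustrative construction, not Schulman's, at a trivially small depth and with a deliberately weak distance parameter): label the edges of a small binary tree and check that any two equal-length paths disagree on a fraction of their labels after the point where they diverge.

```python
import random
from itertools import product

random.seed(1)
ALPHABET = list("abcdefghijklmnopqrstuvwxyz")  # illustrative label alphabet
DEPTH = 8                                      # toy depth; a real tree code must work at every depth

# Label the edges of the full binary tree of depth DEPTH. A node is identified
# by its bit-string path from the root; the two edges leaving each node get
# two *distinct* random labels.
labels = {}
for d in range(DEPTH):
    for node in product("01", repeat=d):
        a, b = random.sample(ALPHABET, 2)
        labels[node + ("0",)] = a
        labels[node + ("1",)] = b

def codeword(path):
    """The sequence of edge labels read along a root-to-node path."""
    return [labels[path[: i + 1]] for i in range(len(path))]

def is_tree_code(alpha):
    """Tree-code property: once two equal-length paths diverge, their label
    suffixes must disagree in at least an alpha fraction of positions."""
    for d in range(1, DEPTH + 1):
        for p in product("01", repeat=d):
            for q in product("01", repeat=d):
                if p == q:
                    continue
                split = next(i for i in range(d) if p[i] != q[i])
                suffix = d - split
                dist = sum(x != y for x, y in
                           zip(codeword(p)[split:], codeword(q)[split:]))
                if dist < alpha * suffix:
                    return False
    return True

# With distinct sibling labels this weak bound always holds at such a tiny
# depth; the hard, still-open problem discussed above is an explicit labeling
# with a *constant* alpha that holds at every depth.
print(is_tree_code(alpha=1 / DEPTH))   # True
```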
They modified Schulman's work to create a code that, while not quite the full theoretical ideal of a tree code, may actually perform nearly up to that ideal in real-world applications. "It's not perfect, but for most purposes, if the technology got to where you needed to use it, you could go ahead with that," says Schulman. "It's a really nice piece of progress." Sahai, Schulman, and other researchers are still working to create perfect tree codes. "As a mathematician, there's something that gnaws at us," says Sahai. "We do want to find the real truth—what is actually needed for this to work." They look to Shannon as their inspiration. "Where we are now is exactly analogous to what Shannon did in that first paper," Schulman says. "He proved that good error correcting codes could exist, but it wasn't until the late 1960s that people figured out how to use algebraic methods to do them, and then the field took off dramatically."

Thursday, January 19. 2012

The faster-than-fast Fourier transform

Via PhysOrg -----

The Fourier transform is one of the most fundamental concepts in the information sciences. It's a method for representing an irregular signal — such as the voltage fluctuations in the wire that connects an MP3 player to a loudspeaker — as a combination of pure frequencies. It's universal in signal processing, but it can also be used to compress image and audio files, solve differential equations and price stock options, among other things.

The reason the Fourier transform is so prevalent is an algorithm called the fast Fourier transform (FFT), devised in the mid-1960s, which made it practical to calculate Fourier transforms on the fly. Ever since the FFT was proposed, however, people have wondered whether an even faster algorithm could be found.

At the Association for Computing Machinery's Symposium on Discrete Algorithms (SODA) this week, a group of MIT researchers will present a new algorithm that, in a large range of practically important cases, improves on the fast Fourier transform. Under some circumstances, the improvement can be dramatic — a tenfold increase in speed. The new algorithm could be particularly useful for image compression, enabling, say, smartphones to wirelessly transmit large video files without draining their batteries or consuming their monthly bandwidth allotments.

Like the FFT, the new algorithm works on digital signals. A digital signal is just a series of numbers — discrete samples of an analog signal, such as the sound of a musical instrument. The FFT takes a digital signal containing a certain number of samples and expresses it as the weighted sum of an equivalent number of frequencies. "Weighted" means that some of those frequencies count more toward the total than others. Indeed, many of the frequencies may have such low weights that they can be safely disregarded. That's why the Fourier transform is useful for compression. An eight-by-eight block of pixels can be thought of as a 64-sample signal, and thus as the sum of 64 different frequencies. But as the researchers point out in their new paper, empirical studies show that on average, 57 of those frequencies can be discarded with minimal loss of image quality.

Heavyweight division

Signals whose Fourier transforms include a relatively small number of heavily weighted frequencies are called "sparse." The new algorithm determines the weights of a signal's most heavily weighted frequencies; the sparser the signal, the greater the speedup the algorithm provides.
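As a small aside on the compression claim above, here is a hedged numpy sketch (our own toy example, not from the paper): an 8-by-8 block built from a few cosines is transformed, all but its 7 heaviest frequency coefficients are discarded, and the block is reconstructed with only a small error.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
x, y = np.meshgrid(np.arange(n), np.arange(n))

# A toy 8x8 block that is genuinely sparse in frequency: three 2-D cosines
# plus a little noise.
block = (1.0 * np.cos(2 * np.pi * (1 * x + 0 * y) / n)
         + 0.5 * np.cos(2 * np.pi * (0 * x + 2 * y) / n)
         + 0.25 * np.cos(2 * np.pi * (3 * x + 1 * y) / n)
         + 0.01 * rng.standard_normal((n, n)))

coeffs = np.fft.fft2(block)            # 64 frequency coefficients

# Keep only the 7 heaviest coefficients and discard the other 57
# (cf. the empirical claim above); each real cosine occupies a
# conjugate pair of coefficients.
flat = coeffs.flatten()
keep = np.argsort(np.abs(flat))[-7:]
compressed = np.zeros_like(flat)
compressed[keep] = flat[keep]

reconstructed = np.fft.ifft2(compressed.reshape(n, n)).real
err = np.linalg.norm(block - reconstructed) / np.linalg.norm(block)
print(f"kept 7 of 64 coefficients, relative error: {err:.1%}")
```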
Indeed, if the signal is sparse enough, the algorithm can simply sample it randomly rather than reading it in its entirety. "In nature, most of the normal signals are sparse," says Dina Katabi, one of the developers of the new algorithm. Consider, for instance, a recording of a piece of chamber music: The composite signal consists of only a few instruments each playing only one note at a time. A recording, on the other hand, of all possible instruments each playing all possible notes at once wouldn't be sparse — but neither would it be a signal that anyone cares about.

The new algorithm — which associate professor Katabi and professor Piotr Indyk, both of MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL), developed together with their students Eric Price and Haitham Hassanieh — relies on two key ideas. The first is to divide a signal into narrower slices of bandwidth, sized so that a slice will generally contain only one frequency with a heavy weight.

In signal processing, the basic tool for isolating particular frequencies is a filter. But filters tend to have blurry boundaries: One range of frequencies will pass through the filter more or less intact; frequencies just outside that range will be somewhat attenuated; frequencies outside that range will be attenuated still more; and so on, until you reach the frequencies that are filtered out almost perfectly. If it so happens that the one frequency with a heavy weight is at the edge of the filter, however, it could end up so attenuated that it can't be identified. So the researchers' first contribution was to find a computationally efficient way to combine filters so that they overlap, ensuring that no frequencies inside the target range will be unduly attenuated, but that the boundaries between slices of spectrum are still fairly sharp.

Zeroing in

Once they've isolated a slice of spectrum, however, the researchers still have to identify the most heavily weighted frequency in that slice. In the SODA paper, they do this by repeatedly cutting the slice of spectrum into smaller pieces and keeping only those in which most of the signal power is concentrated. But in an as-yet-unpublished paper, they describe a much more efficient technique, which borrows a signal-processing strategy from 4G cellular networks. Frequencies are generally represented as up-and-down squiggles, but they can also be thought of as oscillations; by sampling the same slice of bandwidth at different times, the researchers can determine where the dominant frequency is in its oscillatory cycle.

Two University of Michigan researchers — Anna Gilbert, a professor of mathematics, and Martin Strauss, an associate professor of mathematics and of electrical engineering and computer science — had previously proposed an algorithm that improved on the FFT for very sparse signals. "Some of the previous work, including my own with Anna Gilbert and so on, would improve upon the fast Fourier transform algorithm, but only if the sparsity k" — the number of heavily weighted frequencies — "was considerably smaller than the input size n," Strauss says. The MIT researchers' algorithm, however, "greatly expands the number of circumstances where one can beat the traditional FFT," Strauss says. "Even if that number k is starting to get close to n — to all of them being important — this algorithm still gives some improvement over FFT."
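To illustrate the "sample at different times" idea in isolation, here is a toy Python sketch (ours, not the authors' actual method, with noise and filtering omitted): if a slice of bandwidth contains essentially one dominant tone, the phase advance between two time-shifted samples is enough to read off which frequency it is.

```python
import numpy as np

n = 1024                 # length of the (hypothetical) signal
true_freq = 137          # the one dominant tone in this slice of bandwidth
t = np.arange(n)
signal = np.exp(2j * np.pi * true_freq * t / n)   # a single complex tone

# The phase advances by 2*pi*f/n from one sample to the next, so two
# time-shifted samples are enough to identify the dominant frequency --
# no need to process all n samples as a full FFT would.
phase_step = np.angle(signal[1] / signal[0])
estimate = (phase_step * n / (2 * np.pi)) % n
print(round(estimate))   # 137
```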
Tuesday, January 17. 2012

HTML bringing us old boot sessions

For any of us who think the past was better... or to show newcomers that, some time ago, a computer was not expected to be always switched on!

Wednesday, January 11. 2012

Holographic code

Via johncook.com -----
In a hologram, information about each small area of the image is scattered throughout the holograph. You can't say this little area of the hologram corresponds to this little area of the image. At least that's what I've heard; I don't really know how holograms work.

I thought about holograms the other day when someone was describing some source code with deeply nested templates. He told me "You can't just read it. You can only step through the code with a debugger." I've run into similar code. The execution sequence of the code at run time is almost unrelated to the sequence of lines in the source code. The run-time behavior is scattered through the source code like image information in a holograph.

Holographic code is an advanced anti-pattern. It's more likely to result from good practice taken to an extreme than from bad practice. Somewhere along the way, programmers learn the "DRY" principle: Don't Repeat Yourself. This is good advice, within reason. But if you wring every bit of redundancy out of your code, you end up with something like Huffman-encoded source. In fact, DRY is very much a compression algorithm. In moderation, it makes code easier to maintain. But carried too far, it makes reading your code like reading a zip file. Sometimes a little redundancy makes code much easier to read and maintain. Code is like wine: a little dryness is good, but too much is bitter or sour.

Note that functional-style code can be holographic just like conventional code. A pure function is self-contained in the sense that everything the function needs to know comes in as arguments, i.e. there is no dependence on external state. But that doesn't mean that everything the programmer needs to know is in one contiguous chunk of code. If you have to jump all over your code base to understand what's going on anywhere, you have holographic code, regardless of what style it was written in. However, I imagine functional programs would usually be less holographic.

Monday, January 09. 2012

NoSQL's great, but bring your A game

Via GIGAOM -----

MongoDB might be a popular choice in NoSQL databases, but it's not perfect — at least out of the box. At last week's MongoSV conference in Santa Clara, Calif., a number of users, including from Disney, Foursquare and Wordnik, shared their experiences with the product. The common theme: NoSQL is necessary for a lot of use cases, but it's not for companies afraid of hard work.

If you're in the cloud, avoid the disk
As he explained, Wordnik actually launched on Amazon Web Services and used MySQL, but the database hit a wall at around a billion records, he said. So, Wordnik switched to MongoDB, which solved the scaling problem but caused its own disk I/O problems that resulted in a major performance slowdown. So, Wordnik ported everything back onto some big physical servers, which drastically improved performance. And then came the scalability problem again, only this time it was in terms of infrastructure. So, it was back to the cloud. But this time, Wordnik got smart and tuned the application to account for the strengths and weaknesses of MongoDB (“Your app should be smarter than your database,” he says), and MongoDB to account for the strengths and weaknesses of the cloud. Among his observations was that in the cloud, virtual disks have virtual performance, “meaning it’s not really there.” Luckily, he said, you can design to take advantage of virtual RAM. It will fill up fast if you let it, though, and there’s trouble brewing if requests start hitting the disk. “If you hit indexes on disk,” he warned, “mute your pager.”
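As a rough illustration of that warning (our own sketch, not anything Wordnik described), you can ask MongoDB how large a database's indexes are and compare that against the RAM on the box. The pymongo and psutil packages are assumed, and the connection URI and database name are placeholders.

```python
from pymongo import MongoClient   # third-party: pymongo
import psutil                     # third-party: psutil

client = MongoClient("mongodb://localhost:27017")   # hypothetical deployment
stats = client["words"].command("dbStats")          # hypothetical database name

index_bytes = stats["indexSize"]                    # total size of all indexes
ram_bytes = psutil.virtual_memory().total

print(f"indexes: {index_bytes / 2**30:.2f} GiB, RAM: {ram_bytes / 2**30:.2f} GiB")
if index_bytes > 0.8 * ram_bytes:
    # Indexes that no longer fit in memory mean reads start hitting the disk --
    # the "mute your pager" scenario described above.
    print("warning: indexes are close to exceeding available RAM")
```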
EBS also brings along issues of its own. At least once a day, Bethea said, queued reads and writes to EBS start backing up excessively, and the only solution is to "kill it with fire." What that means changes depending on the problem, but it generally means stopping the MongoDB process and rebuilding the affected replica set from scratch.

Monitor everything

Curt Stevens of the Disney Interactive Media Group explained how his team monitors the large MongoDB deployment that underpins Disney's online games. MongoDB actually has its own tool, the MongoDB Monitoring Service (MMS), that Stevens said he swears by, but it isn't always enough. It shows traffic and performance patterns over time, which is helpful, but it is only a starting point.
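As one example of the kind of check a dashboard alone won't shout about, here is a hedged pymongo sketch (our own, not Disney's setup; replSetGetStatus is a standard MongoDB command, but the connection URI and the 30-second lag threshold are placeholders): poll the replica set and flag unhealthy or lagging members.

```python
from pymongo import MongoClient   # third-party: pymongo

client = MongoClient("mongodb://localhost:27017")       # hypothetical deployment
status = client.admin.command("replSetGetStatus")

# Compare every member's last applied operation time against the primary's.
primary = next(m for m in status["members"] if m["stateStr"] == "PRIMARY")
for member in status["members"]:
    if member["health"] != 1:
        print(f"ALERT: {member['name']} is unhealthy")
    lag = (primary["optimeDate"] - member["optimeDate"]).total_seconds()
    if lag > 30:   # arbitrary threshold for this sketch
        print(f"ALERT: {member['name']} is {lag:.0f}s behind the primary")
```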
Oh, and just monitoring everything isn't enough when you're talking about a large-scale system, Stevens said. You have to have alerts in place to tell you when something's wrong, and you have to monitor the monitors. If MMS or any other monitoring tool goes down, you might think everything is just fine while the kids trying to have a magical Disney experience online are paying the price.

By the numbers

If you're wondering what kind of performance and scalability requirements forced these companies to MongoDB, and then to customize it so heavily, here are some statistics:
For more-technical details about their trials and tribulations with MongoDB, all three presentations are available online, along with the rest of the conference's talks.

Personal Comments: Here are some basics and information on NoSQL: Wiki, NoSQL Databases, MongoDB
