2008Q3 Grant Proposal: Bavl
Fri, 01-Aug-2008 by
Alberto Simões
**Authors**
John Beppu
Pip Stuart
**Title**
Bavl (pronounced "bah-vell", as Tower of "Babel" in Hebrew)
**Synopsis**
Bavl is a Free (GPLv3'd) web application for collaboratively learning how to comprehend && speak foreign languages. At its core, it is a system for searching through a database of words, phrases, or lessons each accompanied by translations. However, instead of just presenting mere text back, each phrase also can have one or more audio recordings associated with it. This will let people actually *HEAR* how to correctly enunciate unfamiliar words (even in a visitor's native language), including valid alternatives, such as the silent or hard "t" in the English word "often". This feature will provide a tremendous benefit to anyone interested in improving their spoken-language skills (i.e., Scholars, Pupils, Students, Learners, etc. of Languages && Social Sciences).
In addition to learning, helping others to learn will be easy too. The UI is designed to encourage multi-lingual contributors to publicly offer literal or figurative translations, native pronunciations, answer questions, provide lesson plans, advice, && pupil feedback. Audio recording && playback will be implemented using embedded Adobe Flash-based widgets (via haXe) within the pages of Bavl.
**Benefits to the Perl community**
Bavl is good for the Perl community, because it is good for all of humanity... but Bavl is especially beneficial for our particular subset of humanity; as Perl hackers, we share reverence of linguistic mastery, && as individuals, we tend to say, interpret, && do things in more than many ways that might seem equivalent, yet they remain subtly distinct. These common ideologies among Perl hackers separate us from other programming communities, no matter our citizenship or native language - our distinctive features are reflected through the characteristics of Perl (our common language). We Perl hackers value our ability to cultivate && creatively express diverse visions within Perl's gracefully expansive language. We hope Bavl will encourage the celebration of pioneers of all linguistic frontiers, enabling neophytes and visionaries alike to increase knowledge && multiply wisdom fruitfully.
Humanity has always struggled to communicate effectively; our often celebrated && widely diverse cultures && languages lead to misunderstandings && inhibited progress because we lack a widely available, && contextually accurate, translation system. We consider Bavl to be a compelling solution to this fundamental problem, because of its inherent simplicity and ubiquitous applicability. As a universally accessible system, Bavl represents the dominant mechanism for language learning of the future.
If Bavl achieves even a small level of global success, it stands to dramatically improve international, intercultural, && interpersonal relations by lessening misunderstandings && cultural ignorance. The Perl community's affinity for linguistics could become a sort of contagious vaccination among all peoples. Someday, the masses might have fun adopting new languages, increasing vocabularies, && teaching philosophies with cross-culturally eloquent accuracy. Plus, Perl hackers would enjoy learning desired languages through an underlying system built primarily from their favorite programming language. This is the human aspect of Bavl's benefits.
On the technical side, Bavl will promote the awareness && understanding of a few new && interesting technologies.
Bavl will be using a Perl module called Continuity, due to its ability to handle many concurrent && long-lived HTTP connections. This provides the technical foundation necessary to implement COMET-based soft-realtime updates on web pages. Thus, Bavl would inform people of the existence of this useful Perl module && give them a potent example of how it can be put to great use.
Bavl will be using CouchDB for storage. Since CouchDB suggests it could become a compelling alternative to relational databases, we intend to determine whether certain application data-sets are substantially easier to express through CouchDB than with a traditional relational database.
Finally, Bavl will alert the Perl community to the existence of haXeVideo, which is a Free (GPL'd) streaming Flash media server. This will let the Perl community know that if you want to stream audio or video content via Flash, it can be done without purchasing a prohibitively expensive media server from Adobe. You still have to pay for bandwidth, though. ;-)
Our hope is that the entirety of the Bavl project will reinvigorate the Perl community && inspire other ambitious && prodigious projects to follow.
**Deliverables**
* a Bavl Perl module distribution will be uploaded to the CPAN
* Bavl will be deployed to HTTP://Towr.of.Bavl.Org/
**Project Details**
Its technical foundation is as follows:
* CouchDB will be used for storage.
* haXeVideo will be used for streaming Flash audio.
* haXe will be used to implement the Flash player && recorder widgets.
* Squatting will be used as the app's MVC framework, && it'll be running (or squatting) on top of Continuity.
* Locale::Maketext will be used for i18n && l10n.
* The Google Language API will be used to do some automatic translation of text (but this functionality may not make it to the first release).
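As a rough illustration of the i18n && l10n piece, here is a minimal Locale::Maketext sketch. The class names `Bavl::L10N`, `Bavl::L10N::en`, && `Bavl::L10N::es` and the sample phrase are assumptions made up for this example; the proposal does not say how Bavl's lexicons will actually be organized.

```perl
use strict;
use warnings;

# Hypothetical base class; Locale::Maketext expects the project to subclass it.
{
    package Bavl::L10N;
    use base 'Locale::Maketext';
}

# English lexicon: _AUTO makes any unknown key translate to itself.
{
    package Bavl::L10N::en;
    use base 'Bavl::L10N';
    our %Lexicon = ( _AUTO => 1 );
}

# Spanish lexicon: [_1] is replaced by the first argument to maketext().
{
    package Bavl::L10N::es;
    use base 'Bavl::L10N';
    our %Lexicon = ( 'Hello, [_1]!' => 'Hola, [_1]!' );
}

package main;

# Ask for a Spanish handle, then translate one interface string.
my $handle = Bavl::L10N->get_handle('es')
    or die "No language handle available\n";
my $greeting = $handle->maketext('Hello, [_1]!', 'Pip');
print "$greeting\n";
```

In a real application each lexicon would normally live in its own module file; they are kept in one file here only so the sketch is self-contained.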
The web application itself will start off with a fairly simple layout. The major controllers of this app are as follows:
* Home -- This will display a list of the most recently added phrases.
* Learning -- The vision for this is that you'd select 2 languages -- one that you know, && one that you'd like to learn. Then you'd be presented with a search prompt where you could ask the system how to say a phrase. When you get results back, you'll also get a chance to hold on to the phrase for future reference, && this will let you build a collection of related phrases.
* Lesson -- The collection of phrases you built while using the Learning controller can be turned into a Lesson. Lessons are a collection of phrases that should have a cohesive theme.
* Chat -- I eventually want to have some chat-like functionality, so that people who are interested in the same languages can actually chat with each other while they're using the site. It'll be like the Chatter Box on HTTP://PerlMonks.Org/ but way more responsive, because we'll be using Continuity.
* Profile -- If you register an account on the site, you'll get a profile page that gives you a summary of all that you've contributed to the system. It'll also present you with a feed of questions that you may be able to answer for other people (because you've told the system that you know how to speak a certain language).
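To make the Learning controller's search a little more concrete, here is a small in-memory sketch of a phrase lookup between a known language && a language being learned. Every field name (`lang`, `text`, `translations`, `recordings`), the file names, && the `find_phrases` helper are hypothetical; the real application is expected to keep such records in CouchDB, && its actual document layout is not specified in this proposal.

```perl
use strict;
use warnings;

# Hypothetical phrase records; the field names are assumptions for illustration.
my @phrases = (
    {
        lang         => 'en',
        text         => 'often',
        translations => { es => 'a menudo' },
        # One phrase may carry several recordings, e.g. valid alternatives.
        recordings   => [
            { file => 'often-silent-t.flv', note => 'silent t' },
            { file => 'often-hard-t.flv',   note => 'hard t'   },
        ],
    },
    {
        lang         => 'en',
        text         => 'thank you',
        translations => { es => 'gracias' },
        recordings   => [],
    },
);

# Return phrases in the known language that match the query
# and have a translation into the language being learned.
sub find_phrases {
    my ($known, $learning, $query) = @_;
    return grep {
        $_->{lang} eq $known
            and exists $_->{translations}{$learning}
            and index(lc $_->{text}, lc $query) >= 0
    } @phrases;
}

my @hits = find_phrases('en', 'es', 'often');
for my $hit (@hits) {
    printf "%s => %s (%d recordings)\n",
        $hit->{text}, $hit->{translations}{es}, scalar @{ $hit->{recordings} };
}
```

Holding on to the returned records would be one simple way to build the collection of phrases that later becomes a Lesson.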
**Project Schedule**
We've already started building the core pieces of this project. Beppu-san wrote an MVC framework called Squatting, because he's a big fan of the Camping framework from the Ruby world, && we wanted to have access to a similar API for building Bavl. He's also written a non-blocking CouchDB client called AnyEvent::CouchDB that will be uploaded to the CPAN soon.
The major pieces left are:
* (1 week) The Flash-based audio playing widget.
* (1 week) The Flash-based audio recording widget.
* (5-7 weeks) The actual web application that brings everything together.
* (1 week) Deployment of the web application.
**Biography**
Beppu: I am a programmer who has been using Perl for 10 years, now. However, for the last 2 years, I took a trip into the worlds of Ruby && JavaScript to see what they had to offer. I have recently returned to Perl, because I decided that it was time to implement the idea for a language learning site that my friend && I have been sitting on, && I believed that Perl (the language created by our favorite linguist/programmer) would be the most appropriate tool for the job.
Pip: I'm a JAPH who loves to learn, dream, && design ambitious things that I don't always quite know how to do all the needed work for... which is why Beppu-san && I make a solid team. I've always hacked at HP48GX calculators, early x86 asm, low-level graphics && 3D, which led to my decade-long career in game-development. I have contributed Free Software to facilitate better tools, data-formats, && interoperability among games whenever I could. I too believe Perl is the most appropriate language for our Bavl project to bloom.
**Amount Requested**
$4,095.63
Comments (5)
Sounds like a really interesting project, but I'm not really seeing the benefit for the perl community.
Hi,
Can you please compare your project with http://www.livemocha.com/? Aren't there other projects where you could collaborate on?
Cheers,
Alberto (chair of TPF GC)
Nice project, but I agree with Leon - not perlish enough.
Hi, I'm Vee Schlais, producer of Bavl for John Beppu and Pip Stuart. I'm responding on their behalf because Pip composed a draft response of such magnitude that we felt it prudent for me to summarize it here. (please let us know if you'd like to experience unfiltered Pip ;)
Regarding Leon and FAGZAL's comments:
Bavl will benefit the perl community by providing the resources and forum to communicate more clearly no matter the hackers' native language. Bringing the international community together will increase the quality of all perl code by desegregating hackers and encouraging global collaboration regardless of the language. It takes effective communication to have a healthy community... besides, Bavl could someday encompass programming languages as well as spoken, which would certainly benefit the perl community. But that might be projecting too far ahead.
Regarding Alberto's comment:
Thanks for pointing out livemocha.com as we were not familiar with it. Before embarking on the Bavl project we researched existing language learning sites and found none adhered to our Free Software and open education philosophy. As an example, while livemocha appears to have a very polished (and well funded) site, they intend to provide neither free education nor software.
Alberto: Pip has also drafted a lengthy response to your e-mail about the significance of the requested grant dollar amount. And he wanted you to know he will post a reply to this forum soon.
If you have any further questions, or just want to chat, please post it here or send any one of us a direct e-mail.
Thanks -Vee
Thanks for asking about the requested dollar amount of Bavl's proposal, Alberto. I didn't figure you'd mind if I posted this response to the forums since others might have interest too.
$4095.63 is a 31337 g33|< h4x0rz way to round numbers to the nearest power of two minus one (as attested to in Stephenson's seminal Snow Crash). In Perl:
`perl -e "print('$', 2**12 - 1, '.', 2**6 - 1)"`
Below is an expanded explanation that I originally drafted (on my BlackBerry, while spending the first week of August in Santa Barbara for my brother's wedding) in response to this question. It is presented here, in case some additional depth is desired that I'm obviously eager to expose. Hopefully I've explained things comprehensively. ;) -Pip
Since quantum computing hasn't been developed far enough for us to be rid of binary computing yet, we're still bound to binary (i.e., powers of two) computational systems. Strangely large integers gain otherwise unlikely significance among hackers due to their binary efficiency.
Perl indexes arrays && lists in a zero-based fashion, like uniform data-types with byte-sized offsets from the data-type's base-address in system RAM (i.e., Random Access Memory) which corresponds to an array name in C/C++... so the first element is at index: [0] && the last element is at index: [n - 1] where n is the total positive integer count of data elements. Perl can count list elements for the array `@example` by just using @example in scalar context. The final element is indexed as one less than the size so: $example[(@example - 1)] == $example[$#example] == $example[-1]. More concretely, start with:
my @example = ('A', 'B', 'C', 'D', 'E', 'F');
... where @example in scalar context results in 6 (six, the integer size of the array of uppercase letters spanning from 'A' to 'F'). $#example results in 5 (five) as the zero-based index of that sixth element 'F'. Negative indices index arrays backwards starting from the end. So, in this example, -1 (negative one) would be equivalent by also indexing that last scalar value of 'F'.
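The indexing rules above can be checked with a short script. This is only a sketch of the same six-letter array; note that it compares the elements with `eq` rather than `==`, since the values are strings && `==` would compare them numerically.

```perl
use strict;
use warnings;

my @example = ('A', 'B', 'C', 'D', 'E', 'F');

my $count      = scalar @example;   # number of elements (array in scalar context)
my $last_index = $#example;         # zero-based index of the final element

print "count=$count last_index=$last_index\n";

# All three expressions address the same final element.
my $by_count    = $example[@example - 1];
my $by_last_idx = $example[$#example];
my $by_negative = $example[-1];

print "all three agree on '$by_negative'\n"
    if $by_count eq $by_last_idx and $by_last_idx eq $by_negative;
```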
Unsigned (i.e., non-negative) integer data-types (i.e., uint) also start from value 0 (zero) && can hold a maximum value of: 2**n - 1 where n is now representing the total number of bits (which typically ascend in increments of eight, forming bytes). Therefore, one byte (i.e., eight bits) can hold 256 different values. This is because the two values a single bit holds, raised to the eighth power for the number of bits, is 2**8, which is also sometimes written with different notation as 2^8, or 2*2 * 2*2 * 2*2 * 2*2 or 4*4 * 4*4 or 16*16. These 256 values span from 0 to 255 for a byte or a single ASCII character (C/C++'s char data-type from before Unicode or many multi-byte character-sets had been developed or widely used). Zero is the first number && typically indexes the null character, which terminates strings (char-arrays) in text or, much less frequently, serves as a terminator for specialized files (EOF as EndOfFile), tapes (EOT as EndOfTape), or other particular data-streams.
Bytes : Bits : Values : Range
1 (one) : 8 : 256 : 0..255
2 (two) : 16 : 65,536 : 0..65535
3 (three) : 24 : 16,777,216 : 0..16777215
4 (four) : 32 : 4,294,967,296 : 0..4294967295
8 (eight) : 64 : 18,446,744,073,709,551,616 : 0..18 quintillion - 1 ;)
16 (sixteen) : 128 : huge : 0..huge - 1
32 (thirty-two) : 256 : fscking psycho huge : 0..aw fscking hell - 1
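The exact figures in the table, including the "huge" rows, can be reproduced with the core Math::BigInt module, since 2**64 && beyond can overflow Perl's native integers. This is a sketch for checking the arithmetic; the `%max_for_bits` hash is just a name chosen for this example.

```perl
use strict;
use warnings;
use Math::BigInt;

# Largest unsigned value for each width: 2**bits - 1.
my %max_for_bits;
for my $bytes (1, 2, 3, 4, 8, 16, 32) {
    my $bits   = 8 * $bytes;
    my $values = Math::BigInt->new(2)->bpow($bits);   # count of distinct values
    my $max    = $values->copy->bdec;                 # highest value, one less
    $max_for_bits{$bits} = $max;
    printf "%2d bytes : %3d bits : %s values : 0..%s\n",
        $bytes, $bits, $values->bstr, $max->bstr;
}
```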
You might also notice that the eight bits making up a byte can be themselves counted in 2**3 as 0..7, so rather counter-intuitively, multiples of three can exhibit binary significance too. I am an eccentrically strong proponent of applying Base64 (which I prefer consisting of 0..9,A..Z,a..z,.,_) to representations of almost every area of computing. B64 chars hold 2**6 values from zero to 63 in six bits. Two B64 chars occupy 12 bits for 4,096 values up to 4095. B64 chars can each easily look distinct (if one is sensitive enough to perceive && preserve case, like capital letters remaining larger && usually less curved && dangly than their lowercase counterparts). They also conveniently encompass enough values to represent separate fields of times (0..59 jinx,framez,secondz,minutez, or 0..23 hourz && TimeZonez) && dates (1..31 days,1..12 months, then encodings for years, centuries, etc.) which exhibit several advantageous && convenient properties over popular alternatives. Please see my CPAN modules: Math::BaseCnv && Time::PT for further reference.
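For readers who want to try this digit set, here is a small hand-rolled sketch of converting non-negative integers to && from it. The helper names `to_b64` && `from_b64` are made up for this example && are not the Math::BaseCnv interface; the digit order (0..9, A..Z, a..z, then `.` && `_`) simply follows the list given above.

```perl
use strict;
use warnings;

# 64 digits in the order described: 0..9, A..Z, a..z, '.', '_'.
my @digits = ('0' .. '9', 'A' .. 'Z', 'a' .. 'z', '.', '_');
my %value_of;
@value_of{@digits} = (0 .. $#digits);

# Convert a non-negative integer into a string of base-64 digits.
sub to_b64 {
    my ($n) = @_;
    my $text = '';
    do {
        $text = $digits[$n % 64] . $text;   # lowest six bits become the next digit
        $n    = int($n / 64);
    } while ($n > 0);
    return $text;
}

# Convert a string of base-64 digits back into an integer.
sub from_b64 {
    my ($text) = @_;
    my $n = 0;
    $n = $n * 64 + $value_of{$_} for split //, $text;
    return $n;
}

print to_b64(63),   "\n";   # largest value that fits in one digit
print to_b64(4095), "\n";   # largest value that fits in two digits
print from_b64(to_b64(4095)), "\n";
```

With this ordering, 63 comes out as the single digit `_` && 4095 as `__`, which matches the 2**6 - 1 && 2**12 - 1 limits described above.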
B64 can handily represent standard playing cards (including jokerz) for games like Solitaire, BlackJack, && Poker. See Games::Cards::Poker. B64 also holds each of the red, green, && blue (RGB) color components of the palette registers of old full-screen VGA text-modes. I even like B64 compressions of phone numbers, credit-card numbers, && IP addresses (both 32-bit v4 dotted-quads && IPv6 which I think are 128-bits each && seem unwieldy as distressingly long strings of segmented hexadecimal && decimal characters in the examples I've seen, but I haven't used them much yet). Another potential benefit of B64 is the ability to transmit binary data like the raw digital music tracks on a CD directly within the simple text body of an e-mail or web-page. I've even encoded Rubik's Cubes in around 64 B64 characters! ;)
Due to the inherent nature of binary computing technology, these 2**n - 1 numbers represent maximum values in a minimum number of bits, so they maximize efficiency. My favorite SciFi book, Snow Crash by Neal Stephenson, relates hackers' penchant for selection of such binarily efficient numbers. Since Beppu-san && I have months of work to do to realize Bavl in some form we've envisioned, I accounted for our expected living expenses, moderate hardware && hosting expenses we'd likely incur, etc. then I separately rounded the dollars && cents components of the figure we'd seek to have granted... yielding:
$4095.63
Sorry if I've belabored the point but I was asked why && didn't know how much detail would be needed for my response to be considered thorough (or even sufficient). I had figured most Perl h4x0rz would intuitively recognize && understand why I chose those values but it's good to be reminded how idiosyncratic my methods tend to be, even among communities of my kin. Please let me know if further clarification or explanation would be beneficial.
Regarding whether the above describes "good reason" for strangeness or not, it seems that determination is now yours (assuming I've explained myself adequately).
Thanks very much.
Sincerely,
-Pip =)
------Original Message------
from: Alberto Simoes via RT <Bugs-Correspond@NetLabs.Develooper.Com>
reply-to: Bugs-Correspond@NetLabs.Develooper.Com
to: PipStuart@GMail.Com
date: Fri, Aug 1, 2008 at 4:10 AM
subject: [perl #57498] TPF Grant Proposal for Bavl Project by Beppu && Pip
mailed-by: NetLabs.Develooper.Com
Hi, Pip
Proposal received. Thanks.
> Amount Requested
>
> $4,095.63
Is there any good reason for this strange value? :)
Cheers
Alberto