Planet Mozilla Interns: David Tran: Growing up in BASES [Source: Planet Mozilla]
Zack Weinberg: The Conference Formerly Known as Oakland, day 3
This day had a lot of interesting papers, but some of the
presentations were disappointing: they spent their time on
uninteresting aspects of their work, or handwaved over critical
details.
That said, most of the work on passwords was compelling, and if you
read to the end there’s a cranky rant about the panel discussion.
Privacy and Anonymity
I missed the train that would have gotten me to SF in time for this
talk by one lousy minute.
These folks had a set of client-only modifications to Tor that,
they claim, both improve latency and add robustness against “profiling
attacks.”
Their latency-reduction shtick is to record propagation delay in
between all Tor relays, and then choose paths with probability
inversely proportional to overall delay; they call this “Weighted
Shortest Path” selection. This does increase the risk of using a path
with compromised relays; they allow the user to tune the
proportionality constant to trade off anonymity for latency or vice
versa, and they also group relays into geographic clusters for WSP and
choose randomly within a cluster. They didn’t attempt to analyze the
odds of getting a compromised relay, that I saw, though. They claim
20% improvement in median latency—that’s relative to the baseline
Tor algorithm, which imposes a 500% median latency penalty relative
to unmasked web surfing.
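A minimal sketch of delay-weighted path selection as described above. The talk did not spell out the exact weighting, so the exponent `alpha` (standing in for the user-tunable constant), the function names, and the sample delays are all assumptions for illustration, not the authors' implementation.

```python
import random


def selection_probabilities(delays_ms, alpha=1.0):
    """Probability of picking each path, proportional to 1 / delay**alpha.

    alpha is a hypothetical tuning knob: alpha = 0 gives uniform selection
    (latency ignored), larger values favour low-delay paths more strongly.
    """
    weights = [1.0 / (delay ** alpha) for delay in delays_ms]
    total = sum(weights)
    return [w / total for w in weights]


def weighted_path_choice(paths, delays_ms, alpha=1.0, rng=random):
    """Pick one path at random using the delay-based probabilities."""
    probabilities = selection_probabilities(delays_ms, alpha)
    return rng.choices(paths, weights=probabilities, k=1)[0]
```

With two candidate paths at 100 ms and 200 ms and `alpha=1.0`, the faster path is chosen about two thirds of the time. That shows the trade-off: the more strongly selection favours fast paths, the more predictable the choice becomes.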
They define “profiling attack” as: unmasking anonymous Tor users by
correlating traffic to the entry node with traffic from the exit node.
Because Tor relays are not uniformly distributed over the network’s
topology, this is plausibly possible for an AS operator, who may control routers that sit between
the Tor client and the entry, and routers that sit between the Tor
exit and the destination site. Of course there is nothing to be done
when the ultimate source and ultimate destination are in the same AS,
but that’s unusual. By default, Tor ensures that no two relays are in
the same /16, which (they say) misses 50% of shared ASes. AS-path
prediction is known to be hard, with the best known approaches only
70% correct; they claim an improvement to 90% (median) but glossed
over the details.
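To make the /16 rule concrete, here is a small sketch of that kind of check using Python's standard `ipaddress` module. The function names and addresses are made up for illustration, and this is not Tor's actual code. The paper's point is that two relays can pass this check and still sit in the same AS.

```python
import ipaddress


def slash16(addr):
    """Return the /16 network that contains an IPv4 address."""
    return ipaddress.ip_network(f"{addr}/16", strict=False)


def same_slash16(addr_a, addr_b):
    """True when both addresses fall inside the same /16."""
    return slash16(addr_a) == slash16(addr_b)


def path_has_slash16_conflict(relay_addrs):
    """True when any two relays on a candidate path share a /16."""
    seen = set()
    for addr in relay_addrs:
        prefix = slash16(addr)
        if prefix in seen:
            return True
        seen.add(prefix)
    return False
```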
An audience member asked: if all clients switch to WSP, won’t that
overload the relays on the shortest paths? The speaker did not appear
to understand the question.
Database searches reveal sensitive information. This paper surveys
one class of techniques to prevent this, which is based on generating
“dummy queries” to obscure what was really searched for. They find
all existing techniques wanting, and offer a systematization of
“features that should be considered in security evaluation” of this
type of design, hoping that people will do better in the future.
I couldn’t make much sense of this talk because the speaker talked
very fast, the slides were in illegibly tiny type, and I’m not
previously familiar with the topic. However, the basic problem seems
to be that it’s not terribly hard to distinguish dummy queries from
real queries, especially for an adversary who knows what system is in
use.
These folks are also trying to improve on Tor’s latency issues, but
they’re doing it by weakening the adversary model: they only care
about protecting against remote tracking by the destination host of
any given TCP connection. To this end they have developed their own
mixmaster scheme which only conceals the initial prefix of the route
from the client to the server. They also go to some length to avoid
increasing the overall path length, minimize cryptographic operations,
and minimize state carried at intermediate relays. They claim an
anonymity set on the order of 2^16 client-hosts by
concealing the first hop; 2^24 with two hops, 2^28
with three, and not much improvement beyond.
They do not try to protect against application-layer tracking (by the
destination host or otherwise), timing attacks, or detailed traffic
analysis a la “Peek-A-Boo, I Still See You” from yesterday. It
seemed like there were a lot of technical details to be worked out,
and I’m also not clear on why the destination host wouldn’t just go
ahead and do application-layer tracking, as they do already even when
you’re not bothering to conceal your IP address.
Passwords
This is an analysis of offline attacks on large files of passwords.
Attackers have a large set of encrypted passwords, and can mount many
guesses; there are various “intelligent” strategies in the literature
for the order in which to guess. How well do these strategies work?
System administrators try to encourage users to make passwords harder
to guess, by imposing “password composition policies”. How well does
that work? Most such policies are ad hoc, not based on empirical
evidence. They also often make the passwords harder to remember,
which can hurt security in other ways.
They use “guess number” (the number of attempts required to guess a
password, assuming a fixed list of guesses tried in sequence) as their
metric, and allow their model attacker to make up to 50 trillion
(short scale; 5 × 10^13) guesses, using precomputed guess-number indexes for
three different attack strategies, of which
Weir’s probabilistic context-free grammar
does best on real passwords. For data, they used several large leaked
password databases, plus passwords solicited from Mechanical Turk
users under various password-composition policies.
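The "guess number" metric can be sketched in a few lines. This is an illustration under simplifying assumptions (a small in-memory guess list, plaintext passwords); the authors used precomputed indexes to reach trillions of guesses, which this sketch does not attempt. All names and sample passwords are hypothetical.

```python
def guess_numbers(ordered_guesses, passwords, max_guesses=None):
    """Map each password to the 1-based position of the first matching guess.

    Passwords not reached within max_guesses map to None.
    """
    position = {}
    for index, guess in enumerate(ordered_guesses, start=1):
        if max_guesses is not None and index > max_guesses:
            break
        position.setdefault(guess, index)
    return {pw: position.get(pw) for pw in passwords}


def fraction_cracked(ordered_guesses, passwords, budget):
    """Share of the password list guessed within a given guess budget."""
    numbers = guess_numbers(ordered_guesses, passwords, budget)
    cracked = sum(1 for pw in passwords if numbers[pw] is not None)
    return cracked / len(passwords)
```

Comparing `fraction_cracked` at a small and a large budget gives the kind of comparison behind the first conclusion below: a policy that looks strong against few guesses may not stay strong against many.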
Some conclusions:
- Policies’ performance when attackers are limited to 10^8
guesses does not predict their performance when attackers can make
10^13 guesses
- Training data conforming to a particular policy helps the attacker a
great deal
- Subsets of test data are not reflective of the general population;
some users just naturally pick better passwords under all policies
- NIST guidelines for password policies do not reflect actual password
strength under each policy
and
- Requiring longer passwords, but not being picky about what’s in
them, produces both better usability and better security.
This is another analysis of password guessing difficulty, but focuses
on user demographics rather than password policy. They collected
keyed hashes of the passwords of 70 million users across various Yahoo-owned sites and
modeled their guessability as a probability distribution (“mean
number of guesses required to succeed with probability α”). Their
main conclusion is that most of the demographic splits they tried
don’t make much difference: in particular, older users seem to pick
slightly better passwords than younger users, and it doesn’t matter
whether the account has a credit card number on file. Also, 42%
of all passwords collected were unique, which is better than I would
have expected.
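One way to read "mean number of guesses required to succeed with probability α" is the α-guesswork statistic from the guessing literature. The sketch below follows that definition as I understand it and works on a plain list of passwords. The paper worked with keyed hashes, and its exact estimator may differ, so treat this as an assumption-laden illustration.

```python
from collections import Counter


def alpha_guesswork(passwords, alpha):
    """Return (mu, g) for a sample of passwords.

    mu: guesses per account needed to break a fraction alpha of accounts,
        trying the most common passwords first.
    g:  expected guesses per account for an attacker who stops after mu tries.
    """
    counts = sorted(Counter(passwords).values(), reverse=True)
    total = sum(counts)
    if total == 0 or not 0 < alpha <= 1:
        raise ValueError("need a non-empty sample and alpha in (0, 1]")
    cumulative = 0  # accounts broken so far
    weighted = 0    # sum of (accounts broken at this rank) * rank
    for rank, count in enumerate(counts, start=1):
        cumulative += count
        weighted += count * rank
        if cumulative >= alpha * total:
            broken_share = cumulative / total
            return rank, (1 - broken_share) * rank + weighted / total
    raise ValueError("alpha not reachable")  # not expected for alpha <= 1
```

For a sample where one password covers half the accounts, `alpha_guesswork(sample, 0.5)` gives one guess per account. That is why a long tail of unique passwords does little to protect the users of popular ones.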
Lots of people have been trying to replace passwords entirely for some
time: why haven’t we succeeded? The authors analyzed what has already
been done and why it didn’t work, and came up with a methodology for
evaluating new systems. The talk is mostly about the details of the
methodology, in a vacuum; I’d rather they had taken more time
describing how specific schemes measured up, and then drawn some
inferences from that. Perhaps this is covered in the paper.
I did see one interesting takeaway: deployability has been neglected
over and over again (despite its obvious relevance).
System Security
This is an attempt to thwart
“return-oriented programming”
code-reuse attacks, by randomizing the location of every
instruction in memory. (The state of the art, “address space layout
randomization,” moves instructions around in large blocks; if the
attacker can find one address in the block, they can find
everything.)
They’re doing this on binaries, so they can’t be absolutely sure where
the instructions are, which pointers are pointers to code, etc. But,
it’s okay to dump some random might-be-an-instruction bytes in some
random places, because they just won’t get used. They statically
translate the binary to an “extended instruction set,” whose
instructions can be arbitrarily rearranged, because each contains the
location of the next instruction (which makes me think of
The Story of Mel).
Then, a dynamic-translating VM executes these extended instructions on
real hardware. More than 95% of instructions can be moved, and there’s
a 13% slowdown.
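A toy model of the "each instruction names its successor" idea may help. It is only a sketch: the two-operation instruction set, the function names, and the dictionary standing in for memory are all invented here, and the real system also has to handle branches, calls, and indirect jumps, which this ignores.

```python
import random


def scatter(program, address_bits=31, rng=random):
    """Place each instruction at a distinct random address.

    Every stored instruction carries the address of the next one, so the
    layout in memory no longer reveals (or depends on) program order.
    """
    addresses = rng.sample(range(1 << address_bits), len(program))
    memory = {}
    for i, (op, arg) in enumerate(program):
        next_addr = addresses[i + 1] if i + 1 < len(program) else None
        memory[addresses[i]] = (op, arg, next_addr)
    return addresses[0], memory


def run(entry, memory):
    """Tiny interpreter that follows the explicit next-instruction links."""
    acc = 0
    pc = entry
    while pc is not None:
        op, arg, pc = memory[pc]
        if op == "add":
            acc += arg
        elif op == "mul":
            acc *= arg
        else:
            raise ValueError(f"unknown op: {op}")
    return acc
```

Running the same program after two different scatterings gives the same result, while an attacker who learns one instruction's address learns nothing about where its neighbours live.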
They achieve 31 bits of entropy, out of a 32-bit address space, in
each instruction’s location. (I presume the other half of the address
space is where they put all the program’s data.) They assured us (but
offered no explanation) that the attacker cannot get at gadgets within
the VM runtime or the dynamically re-translated code that’s actually
getting executed, and they demonstrated the defeat of a real-world ROP
attack on Adobe Acrobat.
If you have a whole bunch of virtual machines running on shared
hardware, you might want to set up a special supervisory VM that can
audit the state of all the others. Unfortunately, all the stock
diagnostic tools are designed to run inside the (possibly virtual)
machine that is being diagnosed. From outside, you are working on a
frozen snapshot, so you have to write code that digs through the
machine’s memory space without benefit of any of the normal diagnostic
APIs or even normal address translation logic. It’s difficult,
tedious, and duplicative.
These people developed a special framework that can run stock
diagnostic tools against a snapshot rather than a live VM. It was
unclear how this got them out of digging through the snapshot. The
talk was mostly a big long list of confusingly-described minor
problems that they claimed to have solved.
This is a similar concept to two talks back, but these people don’t want
to have to run a dynamic translator, so instead, they scramble up the
binary in place, ahead of time. They don’t move code around at large
scale; the idea seems to be to eliminate byte sequences that comprise
ROP gadgets, rather than make it hard to find them. They gave several
worked examples of their transformations. It basically amounts to
redoing the late stage of a compilation pipeline: select alternative
instruction encodings as long as they’re the same length; within a
basic block, reschedule instructions as long as dependencies are still
honored; shuffle the order of register saves; redo the register
allocation. Overall, they claim to be able to eliminate 80% of all
ROP gadgets, with the remaining 20% due to not being able to prove
that the entire contents of the compiled-code segment of the binary
image are in fact code, rather than data. But this is good enough to
break two state-of-the-art exploit generators (Mona and Q).
After the talk, I asked them if they’d considered modifying a compiler
to avoid generating gadgets in the first place. They said several
other people had already done that
(e.g. “G-Free: Defeating Return-Oriented Programming through Gadget-less Binaries”),
so they wanted to see if it was doable with no more information than
you get from an executable.
A “trusted path” is a secure channel from some process to the user,
so that the user can be sure that confidential information goes
straight to its intended recipient. This was difficult to achieve
even on the simple TTY-oriented computers of yesteryear and is
nigh-impossible now, because there are many more system components in
between the input devices and the desired destination of user input.
The authors have developed a bolt-on trusted path that works together
with the commodity OS of your choice (quite possibly “as long as that
choice is Windows”, however). It consists of a very simple
hypervisor, whose main function is to provide device isolation: none
of the other devices on the system can be allowed to tamper with the
device hosting the secure channel. That device is given over
temporarily to a trusted device driver embedded in the program that
uses the secure channel. Finally, an external verification device
interrogates a TPM to confirm that all the software involved in the
trusted path is as it should be. They claim that this design
minimizes trusted components and allows the code that’s both trusted
and highly privileged to be as simple as possible.
This is clever and all, but it seems to me that if you have to have an
external verification device, you maybe should just pull the
confidential interaction out to that device and use the computer
only to transport messages that are already secured.
Panel Discussion: How can a focus on “science” advance research in cybersecurity?
The thinking behind this prompt appears to have been: We’ve been doing
“cybersecurity” “research” (for some value of both terms) for
thirty-three years at this conference and quite a bit longer
elsewhere, and yet we do not seem to be getting anywhere in terms of
actual security improvements delivered to the general public, or even
to the funding agencies (read: the US military). Clearly this means
we are not being Scientific enough. What’s wrong and how do we fix
it?
As you might suspect, I don’t much agree with any of the above
premises or their presuppositions. Yeah, we haven’t reduced computer
security to well-understood engineering practice yet. We’re still
doing messy, methodologically unscrupulous, exploratory science. To
an observer not directly engaged in the work, this can look like
flailing around in the dark, and it can be frustratingly slow, but if
you go back to the pioneering work in any scientific field, you see
the same sort of thing. This is how exploratory science goes. That
isn’t to say the way we do our research is ideal, but I think the real
problems are just the well-known dysfunctions of the modern academic
enterprise and its funding sources.
We do have a genuine problem with translating what solid results we
have into real-world security improvements. We have known how to
design ciphers that resist all known attacks for some years now, and
yet—as seen in the
case of the satellite phones
at this very conference—people in industry are still deploying
ciphers that are trivially broken. This I believe should be blamed on
the overlapping well-known dysfunctions of what my pinko commie
friends call “terminal-stage capitalism.” Conversely, industrial
folks have a legitimate beef with academics doing research that has no
possible relevance to the real world at all. I didn’t see any of that
at this conference, except maybe the “Why Johnny Isn’t Robust” guys
from yesterday, but it does happen. (I think as CS subfields go,
security is generally better about staying relevant than
e.g. programming language design.) And we shouldn’t forget that we
are, in fact, doing better now than we were ten years ago. TLS is a
pile of hacks on top of a flawed design but mostly it works.
Okay, I’ve dumped on the prompt enough, how was the panel? It was a
whole lot of words saying not much at all. Which is what you get when
you start with the wrong question—and I think most of the panelists
knew in their bones it was the wrong question—and then make people
talk for five minutes each. I’ll admit it was entertaining when they
asked for audience questions and a whole row of people brought up
their most favorite axes and started grinding. [Source: Planet Mozilla]
Naoki Hirata: Screenshots on your android device … without ADB … without another App!
In my quest to make life easier for Eng people and the end users, I came across an interesting thing when I was looking at the /system/bin directory of ICS (side note: adb shell points to files in /system/bin): there are two executables listed, screencap and screenshot.
As curiosity arose, I decided to look on the net to see what I could find about taking screenshots without needing an app. Sure enough, I stumbled upon this article: http://alexsleat.co.uk/2011/11/28/how-to-take-screenshots-in-ice-cream-sandwich-android-4-0-x/
You press up up down down left right left right b a select start. Oh wait, that’s the Konami code. In all seriousness, you hold down the Volume Down + Power buttons for about a second on ICS devices.
Turns out that there are other devices that can do this as well:
http://androidcommunity.com/samsung-galaxy-s-ii-built-in-screen-capture-no-root-required-20110514/
“Hold down the home and power button on the Samsung Galaxy SII.”
“The Droid Charge requires holding back, and hitting menu.”
I confirmed the Galaxy S II and Nexus tips. I haven’t confirmed the Droid Charge, but I get the feeling that it’s correct.
Here are some other tips and tricks that you can do on the Galaxy Nexus: http://www.gottabemobile.com/2011/12/15/10-neat-tricks-and-tips-for-the-samsung-galaxy-nexus-on-verizon/
I haven’t found anything regarding on-device video capture on ICS, but when I have time I might end up playing around with that option plus the dev option of tapping.
[Source: Planet Mozilla]
Benjamin Kerensa: Tunneling IPv4 Traffic over DNS on Ubuntu 12.04
Iodine is an open source application that has a client and server which in combination will allow a client to tunnel their IPv4 traffic over the DNS protocol and potentially bypass some censorship of traffic on the LAN or even WAN.
Installing and setting up Iodine on Ubuntu 12.04 LTS Server is a snap with these instructions which are intended to show how simple it is to get Iodine working on Ubuntu 12.04 and to show what a cool application Iodine is.
1. Install Iodine on both the client machine and the server:
sudo apt-get install iodine
2. Start the Iodine server:
 sudo iodined -f 10.0.0.1 -P blah test.com
3. Start the Client:
 sudo iodine -f -r 50.116.11.150 -P blah test.com
 (replace 50.116.11.150 with your server IP)
4. Do Ping/Traceroute or even use mtr to ensure everything is working:
 ping 10.0.0.1
 (this confirms iodine is working; notably, I have no devices on my LAN using 10.0.0.1)
You now have a working Iodine install, but obviously in a more thorough setup you will want to configure Iodine to use something other than test.com, set up a relaying nameserver, and perhaps route all traffic through Iodine. The instructions to do this are here and pretty easy. [Source: Planet Mozilla]
Jen Fong-Adwent: In Search of Prototyping
I’ve been thinking about prototyping. And web applications. And applications that aren’t IDEs. And open web.
That’s a lot of things to think about - but there is also a lot lacking in the solutions we are currently given. For instance, let’s say I am a person who needs to build concepts quickly to test out product ideas but I may only know a little bit of code, a little bit about Photoshop and a little bit about wireframing. What are my options?
I could learn about programming from outputting ‘hello world’ in some arbitrary scripting language to building an entire CRUD system and then take some unknown amount of time to learn someone’s proprietary IDE that hopefully increases my productivity in building apps. Does something like this make sense for me?
The interface is just too confusing and complicated to get anything done right away. Sure, I could read all the documentation, search around online for various tutorials and sources - but then I would just be cornering myself as a potential specialist in this proprietary software. Let’s try something else.
I just want to drag and drop things - and have it generally look as expected immediately. I don’t want to read all this text about how to use it. I just want to use it. How about something like this?
Ahh, this interface is so much nicer to work with. There is one unfortunate thing - there is no actual working code in the back. This is literally a mockup - a ghost of a Photoshop file with all the common web-elements we know and love but lacking the design. This is done on purpose to keep the user from treating it as the real thing, but is this good enough? What if we could have the best of both worlds - prototype-like user execution that actually works with real code without having to write all that code or read mountains of documentation on how the system works?
Let’s look at OSX’s Automator for some clues on how we can bridge the gap:
Here we have something that you can drag and drop, is easy to understand AND actually generates some functionality.
What if we could combine elements of all of these examples and make it web-based, open source and easy to use? What would it mean in terms of optimizing workflow from product to development to production?
These are the questions I want to investigate so that we can make a tool that helps people build functioning prototypes and remove the ambiguity that occurs during planning, sketching and iteration. [Source: Planet Mozilla]
Ben Hearsum: Update on Mac Code Signing #2
In my previous post I said that Mac signing had been turned on, and that it would stay on. Unfortunately, the following morning we caught some issues that only came up after updating. It took a bit of time to resolve those, but as of this morning we got them worked out, and signed Mac builds are back. There are a couple of minor issues to address, but we’re in a good enough state to backport signing to Aurora and Beta, and ship it in Firefox 13.0. On Saturday evening I’ll be enabling it on Aurora, and on Monday morning on Beta – assuming no new issues come up.
If you see ANY issues related to updating on Mac please file a bug and cc me (:bhearsum). It’s very important that any issues are brought up immediately, as we intend to ship this to the release channel on June 5th.
Huge thanks (again) to Steven Michaud, Ted Mielczarek, Erick Dransch, and anyone else that helped make this happen. [Source: Planet Mozilla]
Madhava Enros: Here's a thing that would be awesome
Here's a conversation I just had in IRC. Is anyone interested in building this?
madhava: hey
madhava: would anyone like to build me an addon where when I enter some hotkey combination
madhava: will bring up some sort of HUD awesomebar
madhava: that I can search through
madhava: and then when I hit enter it will insert the matching URL in my current textfield
madhava: or put it in the copy buffer?
madhava: I reference a lot of URLs in emails and bugs
dietrich: quickfind-that-url. that'd be nice.
madhava: and I have to open a new tab, use the awesomebar, select, copy, switch back, paste
madhava: dietrich: yeah
mitchell: history viewer?
madhava: mitchell: sort of -- but too heavyweight
dietrich: i do the same thing. have to navigate open tabs, all kinds of crap to find a url.
madhava: in some ways, even a dropdown as soon as I type http:// would do it
madhava: like in a IDE
madhava: but then I'd have to type http
mitchell: I just begin typing the url to have it come up, hit down, hit right, hit ctrl a, ctrl c
dietrich: all the pieces are there for doing this
dietrich: it works even with open tabs
dolske: sounds like dietrich knows how to do this... ;)
dietrich: hotkeys + panel + Awesomebar sarch (from https://github.com/mozilla/addon-sdk/wiki/Community-developed-modules)
dietrich: dietrich cannot do this :)
madhava: maybe I'll blog this
dietrich: jono and gozala and i were just lamenting the lack of Ubiquity, which could easily do this :(
mitchell: what is hot key / hotkeys?
dietrich: mitchell: jetpack build-in api for registring keyboard combos with function callbacks
mitchell: dietrich: thx. mind sending link to api doc?
dietrich: https://addons.mozilla.org/en-US/developers/docs/sdk/latest/
madhava: ok - dietrich, mitchell, dolske - I'm more or less going to paste this conversation into a blog post
madhava: any of you want to be anonymous?
dietrich: clipboard api is the final piece, and that's built-in too
mitchell: I don't mind my nick going in
mitchell: dietrich: that's what I wanted to find out
dietrich: anonymity is for the anonymous
dietrich: this is a public channel, it's too late!
If you are, you can email me, or reach @madhava on Twitter. [Source: Planet Mozilla]
Edward Lee: Copy Selected urls from the AwesomeBar
I happened to switch to #fx-team to see madhava asking for an easier way to copy/paste urls from the AwesomeBar into the page. So I whipped together something to do just that!
Just switch to the location bar by pressing ctrl/cmd-L, start searching, highlight the result you want, and press ctrl/cmd-enter. The url will be in the clipboard and automatically pasted to wherever you left off in the page.
And of course this works with Enter Selects, so you don’t even need to press down to copy the first result. Enter Selects automatically highlights it, so you can type out the page you want, and directly hit ctrl/cmd-enter and you’re done!
Try out Copy Selected or check out the code on github. (This is neater than I expected! I just used the functionality 3 times in one post.)
[Source: Planet Mozilla]
Firebug Blog: Firebug 1.10a9
getfirebug.com has Firebug 1.10a9
Firebug 1.10a9 fixes 19 issues.
Some highlights from this release:
- Support for :focus CSS pseudo class (issue 3407)
- Support for keyboard navigation within completion list (used in the command line) (issue 3660).
- Better clipboard copy for styles in the Style side panel (issue 5461)
Please post feedback in the newsgroup, thanks.
Jan ‘Honza’ Odvarko [Source: Planet Mozilla]
Matt Thompson: CoderDojo teams up with Mozilla for “Summer Code Party”
CoderDojo is coming to the party. Are you?
As we mentioned with Tuesday’s launch of Mozilla Webmaker, Mozilla is inviting people and partners around the world to teach and learn the web through our Summer Code Party.
One of those partners is CoderDojo, a growing international movement to create “code clubs” for youth around the world. CoderDojo founder James Whelton joined our Webmaker community call to tell us what they’re bringing to the big Summer Code Party — and why teaching youth tech matters.

What’s CoderDojo all about?
CoderDojo is a movement of free coding clubs for young people. Begun only eleven months ago in Ireland, they now boast over 70 Dojos worldwide. Initially conceived as a “fight club with keyboards,” organizers discovered quickly that their events were booking up “faster than Ireland’s most popular boy band concerts” and attracting almost as many girls as boys.
James describes it as “the boys and girls scouts of coding.” Youth learn how to code, develop websites, apps, programs, games and more. By matching web developers with enthusiastic novices aged 7-18, CoderDojo mentors help kids develop problem solving skills, show off their work, and gain access to a supportive online network.

Why is CoderDojo joining the Summer Code Party?
“It’s really important that the kids get exposed to contemporary companies and technologies,” James said. “And we want to broaden our global network.”
What do they have planned?
- A series of upcoming Code Jams
- National and international tournaments
- A belting and badging tie-in, based upon achieving mastery of various skills and webmaker projects.

Get involved
[Source: Planet Mozilla]
Pedro Alves: CDV - Request For Comments
Community Data Validation
Motivations
We need a way to do data validation.
Use cases
Below are some of the use cases we want to tackle. Emphasized are the ones we think the current spec satisfies.
- Global
  - Is the server running?
  - Is the server running properly?
- Connectivity
  - Do we have all the access we should? (network / database)
- Query specific
  - Do we have up to date data?
  - Can we trust the data?
  - How long did the queries take to run?
  - Do we have wrong data? (duplicated users in community)
  - Do we have a big number of 'unknowns'? (tk=1 in DW)
  - Do we have peaks or valleys in the data? (due to double process or no process)
  - Is the data stalled? (eg: number of twitter followers not updating)
  - Did the data format change?
  - We need a way to handle known effects (eg: Christmas dip)
  - We need to correlate independent datasources (eg: comparing AUS with Blocklist evolution)
  - We need to connect the long running queries from CDA to CDV
  - Be able to validate big chunks of reprocessing
  - Do we have clearly wrong lines in resultset? (eg: a line there)
- Dashboards
  - Are the dashboards rendering properly?
  - Do we have all the components?
  - Any layout change?
  - Any js errors?
  - Are the dashboards performing properly?
  - Can CDF talk with CDV to report client-side errors?
  - Alternatively, can CDA talk with CDV to report query errors?
- ETL
  - Did the etl run?
  - Are we processing the expected amount of data?
  - Is the etl taking the expected time to run?
  - Did the etl finish before X am?
  - Test etl against tracer bullets?
Work flow
We expect from this system:
- A central dashboard that allows us to quickly glimpse the overall status of our system.
  - Did all tests pass?
  - Which one failed?
  - Why?
  - When was the last time the tests ran?
- We need simple ways to define the tests (based on existing CDAs)
- We need to know which queries failed and which queries took a long time to run
- We need a push notification system by email
- We need to make sure it can talk to nagios
- We need an outside test to check if the server is up
Logging types
Every test will result in the following levels:
Each specific test will be responsible for converting the output of that test (validation function for cda, tbd for kettle) into that status. The object format is:
{ level: "Critical", type: "Missing data", description: "Whatever string the user defined" }
On each test definition, we need to be able to optionally set a timing threshold for the queries, and that will automatically generate a log with Type 'Duration'.
Test types
There are 4 possible types of tests:
- CDA based query validation
- ETL monitoring
- Datawarehouse validation (a specific set of the cda based query validation)
- Dashboard validation (we may opt to leave this one out for now as we'll try to infer the errors from CDA's 405)
CDA based query
Workflow
We want to select one or more cda / dataAccessId from our system, define the input parameters and select the type of validation we need. The shape of the function will be:
f( [ query, [params] ], validationFunction )
The generic test will be the implementation of the validation function:
validationFunction = function ( [ {metadata: [] , resultset: [[]]} ] ) : value
That will be freely mapped to the log outputs.
ETL monitoring query
The workflow defined here has to match with the previous section. We'll build specific CDA queries that will read the kettle log files. From that point on, specific validations will have to be built for these logs. We'll need, in pentaho, to define which connection refers to the kettle logging tables, either by defining a special jndi or specifying it in the settings. We'll need to test for:
- Time
- Start / end time
- Amount of data processed
Datawarehouse schema validation
There are some specific tests we can do on the sanity of a datawarehouse.
- Coherent amount of data on a daily / hourly basis
- Test the same as before with specific breakdowns
- Test for the amount of 'unknowns' on dimensions
Invocation and Scheduling
There are 2 ways to call the validations:
- By url request
- Scheduled calls
Url will be based on the id / query name (tbd). The scheduled calls will be cron based, with the following presets:
- Every hour
- Every day
- Every week
- Every month
- Custom cron
User interface
These are the features in the main user interface (this is the ultimate goal, the implementation may be broken into stages):
- See existing validations
- Allow to fire a specific validation
- Get the url of a specific validation / all validations
- Create / Edit validation
  - Define query name
  - Define queries and parameters
  - Define validation function
  - Choose log alerts (when to throw error / severe / warn / ok)
  - Choose duration thresholds
  - Define error message
  - Define cron
- Validation status dashboard
- CDA Query error dashboard (Should this belong to CDA instead?)
  - Query and parameters
  - Error
  - Incidents
- Duration dashboard to identify slow points in the system
  - Query and parameters
  - Duration
  - Incidents
Technical approach
All the specific information will be stored in solution/cdv/queries/. The files will have the format _queryName.cdv and will internally be a JSON file with the following structure:
{
  type: "query",
  name: "validationName",
  group: "MyGrouping",
  validation: [
    { cdaFile: "/solution/cda/test.cda", dataAccessId: "1", parameters: [...] },
    { cdaFile: "/solution/cda/test2.cda", dataAccessId: "2", parameters: [...] }
  ],
  validationType: "custom",
  validationFunction: "function(resultArray,arguments){ return 123 }",
  alerts: {
    /* These functions will be executed from the bottom up. As the functions
       return true, the next one will be executed and the last matching level
       will be thrown. The exception to this rule is the optional okAlert(v)
       function. If this one returns true, no other calls will be made */
    criticalAlert: "function(v){ return v > 10 }",
    errorAlert: undefined,
    warnAlert: "function(v){ return v > 5 }",
    okAlert: "function(v){return v<3;}",
    alertType: "MissingData",
    alertMessage: "function(level,v){return 'My error message: ' + v}" /* this can either be a function or a string */
  },
  executionTimeValidation: {
    expected: 5000,
    warnPercentage: 0.30,
    errorPercentage: 0.70,
    errorOnLow: true
  },
  cron: "0 2 * * ? *"
}
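The alert rules and the execution time thresholds above are only described in a comment and a few field names, so here is a minimal Python sketch of one way to read them. It is not CDV code. The function names, the level strings, the choice to skip undefined levels, the choice to return None when nothing matches, and the reading of warnPercentage / errorPercentage as relative deviation from expected are all assumptions made for illustration.

```python
# Hypothetical Python rendering of the alert rules in the spec; not CDV's API.

# Levels are checked from the bottom up, as the spec's comment describes.
LEVELS_BOTTOM_UP = [("warnAlert", "WARN"),
                    ("errorAlert", "ERROR"),
                    ("criticalAlert", "CRITICAL")]


def resolve_alert_level(value, alerts):
    """Return the alert level for a validation result, or None if none matched."""
    ok_check = alerts.get("okAlert")
    if ok_check is not None and ok_check(value):
        return "OK"  # okAlert short-circuits: no other function is called
    matched = None
    for key, level in LEVELS_BOTTOM_UP:
        check = alerts.get(key)
        if check is None:
            continue  # assumption: an undefined level is simply skipped
        if not check(value):
            break  # the chain only continues while functions return true
        matched = level  # remember the last (highest) matching level
    return matched


def classify_duration(duration_ms, config):
    """Classify a query duration against executionTimeValidation settings.

    Assumption: the percentages are relative deviations from `expected`,
    and errorOnLow means unexpectedly fast queries are flagged too.
    """
    deviation = (duration_ms - config["expected"]) / config["expected"]
    if deviation < 0:
        if not config.get("errorOnLow", False):
            return "OK"
        deviation = -deviation
    if deviation > config["errorPercentage"]:
        return "ERROR"
    if deviation > config["warnPercentage"]:
        return "WARN"
    return "OK"


# The example values from the JSON structure, rewritten as Python callables.
EXAMPLE_ALERTS = {
    "criticalAlert": lambda v: v > 10,
    "errorAlert": None,
    "warnAlert": lambda v: v > 5,
    "okAlert": lambda v: v < 3,
}
EXAMPLE_TIMING = {"expected": 5000, "warnPercentage": 0.30,
                  "errorPercentage": 0.70, "errorOnLow": True}

if __name__ == "__main__":
    for v in (2, 7, 12):
        print(v, resolve_alert_level(v, EXAMPLE_ALERTS))
    for ms in (5200, 7000, 9000):
        print(ms, classify_duration(ms, EXAMPLE_TIMING))
```

With the example thresholds, a value between 3 and 5 matches neither okAlert nor warnAlert. The spec does not say what should happen then, which is why the sketch returns None instead of guessing a level.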
Preset validations
We won't need to manually define all kinds of validations. CDV will support a set of presets that can also be extended by adding definitions to solution/cdv/validationFunctions/. The template for one such JavaScript file looks like this:
wd.cdv.validation.register({
  name: "Existence",
  validationArguments: [
    {name: "testAll", type: "boolean", default: true}
  ],
  validationFunction: function(rs, conf) {
    // true when every result set has rows (testAll) or at least one has rows
    return rs.map(function(r){ return r.length > 0; }).reduce(function(prev, curr){
      return conf.testAll ? (curr && prev) : (curr || prev);
    });
  },
  alertArguments: [
    {name: "failOnExistence", type: "boolean", default: true},
    {name: "failureLevel", type: "alarmLevel", default: "ERROR"},
    {name: "failureMessage", type: "String", default: "Failed Existence Test: ${result}"}
  ],
  alertMapper: function(result, conf) {
    // note: successMessage is used here but not declared in alertArguments
    var success = conf.failOnExistence && result,
        level = success ? "OK" : conf.failureLevel,
        message = success ? conf.successMessage : conf.failureMessage;
    return Alarm(level, message, result);
  }
});
The wd.cdv.validation API is defined in the Validation Module. The objects there that we need to analyze are:
- validationFunction(rs, conf) - This is the validation function that will be executed after the query runs
- validationArguments - Definition of the arguments that will be used within the validation function
- alertArguments - Definition of the arguments that will be sent to the alertMapper
- alertMapper(result, conf) - Mapping between the validation result and the alerts
Preset validations or custom validations
When we define a query, we can choose which validation function to use and pass the parameters that specific validation requires. Alternatively, we can use a custom validation function. That validation function has the following format, where all we need is to return the alarm level (this is a spec and may change after implementation):
function(rs, conf) {
  var exists = rs.map(function(r){ return r.length > 0; }).reduce(function(prev, curr){
    return conf.testAll ? (curr && prev) : (curr || prev);
  });
  return exists ? Alarm.ERROR : Alarm.OK;
}
CDA integration
We need a tight integration between CDA and CDV to report:
- Errors in CDA queries
- Long running CDA queries
- Queries with obvious errors in the structure (eg: missing lines)
This integration will need to take into account that CDV may not be installed, and it must not have a performance impact on CDA.
External interfaces
We can have several external interfaces supported:
- Email
- Http
- Nagios integration
- Server up check
The last one is a very specific check: all the other integrations will fail if the server suddenly hangs, and we must be notified of that. With the HTTP and Nagios integrations, we'll be able to get reports on the individual tests and also on the test groups. This will not rerun the tests but will return the last status of a test. In the HTTP case, we can pass a flag to force a test to be rerun. For Nagios, we can have an export of test rules.
Settings
We'll be able to define the group rules, mainly for connectivity reasons. So the settings (which can later be converted to a UI) will look like this:
SMS is included here as an example but is not planned to be supported.
[Source: Planet Mozilla]
Mitchell Baker: The Ada Initiative Advisory Board Changes
The Ada Initiative was founded early in 2011 when Val Aurora and Mary Gardiner decided to take the plunge, quit their jobs and found an organization dedicated to supporting women in open technology and culture. I had met Val a few months earlier and was very excited by their plans. When asked, I was very pleased to join the Advisory Board. It’s rare that I join advisory boards, but in this case I was happy to do so.
Just over a year later, TAI will shortly hold its second major event, AdaCamp, in Washington DC. It has been very active in bringing an anti-harassment mentality to open technology and culture gatherings, and has provided consulting assistance to a number of organizations. In the past few days TAI has been approved as a tax-exempt organization in the United States, an important marker in the start-up phase. The visibility of TAI is growing, and the Advisory Board has grown to 19 people representing a broad range of open technology and culture expertise. One of the additions is Lukas Blakk, also of Mozilla. In addition, Caroline Simard recently joined The Ada Initiative’s Board of Directors. I’m quite familiar with Caroline’s work at the Anita Borg Institute and at Stanford. She brings a great deal of expertise both in the research about women in the workplace and in putting that research into practice.
TAI is growing in scope and in capabilities. Given that the initial launch period has passed successfully, I’ll be stepping down from the Board of Advisors. I continue to support TAI and its activities. With Lukas on the Advisory Board we will continue to have a powerful connection between TAI and Mozilla, and Lukas will have a good sense of when I can provide some particular type of support. It’s important to make room for new people. I can continue to contribute effectively in other roles, so I feel it’s time for me to free up a space on the Advisory Board for someone new to step up, in line with TAI’s approach to volunteer service.
I look forward to seeing new faces at TAI, at the Advisory Board, and in open technology and culture in general.
[Source: Planet Mozilla]
Bonjour Mozilla: Mozilla brings the heavy hitters to Sud Web
...and it isn't Bonjour Mozilla taking the liberty of that title, but the people concerned themselves (the proof, here). Since this photo was sent, Anthony and Jérémie have arrived safe and sound at Sud Web, one of our favorite events! Unfortunately, it’s too late to buy tickets for this year, but Bonjour Mozilla recommends that you go and enjoy these two days in the Rose City next time if you weren't lucky enough to get a ticket this time. In any case, Anthony and Jérémie will each give a talk: one on what altruism can bring you (say goodbye to proprietary software, and hello to free software!), and the other on MDN (obviously!). Come one, come all!
Bonjour guys!
[Source: Planet Mozilla]
Alon Zakai: Emscripten and LLVM 3.1
LLVM 3.1 support for Emscripten just landed in master; all tests pass and all benchmarks either remain the same or improve from 3.0.
LLVM 3.1 is now the officially supported version; all testing from now on will be on 3.1. The Emscripten tutorial has been updated to reflect that.
(3.0 might work, it does right now, but over time that might change.)
[Source: Planet Mozilla]
Tarek Ziadé: zmq and gevent debugging nightmares
Note
Powerhose turns your CPU-bound tasks into I/O-bound tasks so your Python applications
are easier to scale.
I've released Powerhose 0.4 on PyPI - http://pypi.python.org/pypi/powerhose/0.4, and this
new version has a few changes that are worth mentioning.
pyzmq + gevent = ?
The biggest issue I faced with Powerhose was related to gevent. We have a powerhose set up
here at Mozilla with 175 workers and each one of them is performing crypto work.
A Powerhose worker is just a call to powerhose-worker pointing to a callable.
What I did not realize was that the module that was imported was also used by our main
application, and was calling gevent and gevent_zmq monkey patching.
gevent_zmq is a library that patches pyzmq so it interacts well with gevent. It's
going to die eventually since pyzmq is including a green module that will provide
gevent compatibility. But this module does not provide a Poller yet.
So, in other words, any project that has pyzmq & gevent will block unless you
use gevent_zmq. And if you use the Poller you need my fork: https://github.com/tarekziade/gevent-zeromq
Back to my workers. Having them patched by gevent/gevent_zmq is not an issue per se.
It can even speed things up very slightly, since the workers sometimes fetch
certificates on the web.
But at some point (more precisely, around every 24 hours) the workers were simply
locking up and hanging there.
After a lot of work, I found out that gevent has its own DNS resolver, which is used
when calling socket.getaddrinfo, and for some reason (a bad interaction between zmq
and gevent, I suspect) a greenlet was locked.
Maybe that's a bug in gevent_zmq, maybe that's an issue in gevent itself.
I failed to find the real reason because the lock happened in various places in the
gevent socket code. I did not spend more time on this since the bug seems to have gone
away once I removed gevent altogether from the workers as we don't use gevent there
and the workers are sync beasts.
The one thing I was able to do though was to write a little piece of code to
dump all running threads and greenlets to understand what's going on:
import gc
import sys
import threading
import traceback

def dump_stacks():
    """Return a list of strings describing every thread and greenlet stack."""
    dump = []
    # threads: map each thread id to its name, then format each thread's stack
    threads = dict([(th.ident, th.name)
                    for th in threading.enumerate()])
    for thread, frame in sys._current_frames().items():
        dump.append('Thread 0x%x (%s)\n' % (thread, threads.get(thread, 'unknown')))
        dump.append(''.join(traceback.format_stack(frame)))
        dump.append('\n')
    # greenlets
    try:
        from greenlet import greenlet
    except ImportError:
        return dump
    # if greenlet is present, let's dump each greenlet stack
    for ob in gc.get_objects():
        if not isinstance(ob, greenlet):
            continue
        if not ob:
            continue  # not running anymore or not started
        dump.append('Greenlet\n')
        dump.append(''.join(traceback.format_stack(ob.gr_frame)))
        dump.append('\n')
    return dump
That should be useful in the future.
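For a process that hangs roughly once a day, it helps to be able to trigger such a dump on demand. Here is a minimal sketch of one way to do that with the standard library only: a signal handler that writes all thread stacks to a stream. This is not part of Powerhose. The function names and the choice of SIGUSR1 are assumptions, the handler covers threads only (not greenlets), and signal handlers can only be installed from the main thread.

```python
import signal
import sys
import threading
import traceback


def dump_thread_stacks():
    """Return the formatted stack of every running thread as one string."""
    names = {th.ident: th.name for th in threading.enumerate()}
    lines = []
    for ident, frame in sys._current_frames().items():
        lines.append("Thread 0x%x (%s)\n" % (ident, names.get(ident, "unknown")))
        lines.extend(traceback.format_stack(frame))
        lines.append("\n")
    return "".join(lines)


def install_dump_handler(signum=None, stream=None):
    """Write all thread stacks to `stream` each time `signum` is received.

    Returns False on platforms that have no SIGUSR1 (the assumed default).
    Must be called from the main thread.
    """
    if signum is None:
        signum = getattr(signal, "SIGUSR1", None)
    if signum is None:
        return False
    out = stream if stream is not None else sys.stderr

    def handler(received_signum, frame):
        out.write(dump_thread_stacks())
        out.flush()

    signal.signal(signum, handler)
    return True


if __name__ == "__main__":
    # After this call, `kill -USR1 <pid>` prints the stacks to stderr.
    install_dump_handler()
    print(dump_thread_stacks())
```

On Python 3.3 and later, the standard faulthandler module offers a similar signal-triggered traceback dump for threads on Unix.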
Bottom line:
- gevent does not have good debugging tools. I guess the function I wrote is
  useful; it can even be injected live into a running process using Pyrasite.
  But gevent should provide this kind of utility imho - I'll propose something.
- I am looking forward to pyzmq.green with the Poll class. We've opened a ticket
  on this, and it will eventually land I guess.
zmq_bind() bug ?
The other issue I had was with the ZMQ bind() API. Powerhose's Broker binds a
socket, but it turns out you can bind as many brokers as you want on the
same IPC or TCP address!
You end up with one active broker and a lot of zombie brokers...
See this bug to reproduce the issue:
https://github.com/zeromq/pyzmq/issues/209 (the pasted
scripts will be online for a month)
And here is the thread I started on the zmq mailing list:
http://lists.zeromq.org/pipermail/zeromq-dev/2012-May/017249.html
So until this is solved, what I did is add a health feature in Powerhose.
You can now call the broker, but instead of passing a job, you can pass
a PING and the broker will directly return its PID instead of
passing your call to a worker.
That's good enough to make sure the broker is up and running, and
the powerhose-broker command line has gained two options:
$ powerhose-broker --check
.. checks if the broker is alive and running, prints its pids...
$ powerhose-broker --purge-ghosts
.. kill any "ghost" broker...
The broker itself does a --check when it starts and exits if it finds
a running broker on the same endpoint.
This will be useful for a Nagios checker. But... zmq should just error out
when you try to bind twice.
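The PING health check described above can be sketched without any zmq code at all. This is a simplified illustration, not Powerhose's actual implementation: the function names, the b"PING" wire format and the callable standing in for the socket round trip are all assumptions.

```python
import os

PING = b"PING"  # hypothetical wire format for the health-check message


def handle_request(message, run_job):
    """Broker side: answer a PING directly, hand anything else to a worker."""
    if message == PING:
        # Reply with the broker's own PID instead of dispatching the call.
        return str(os.getpid()).encode("ascii")
    return run_job(message)


def broker_is_alive(send_request):
    """Client side: return the broker's PID, or None if it does not answer.

    `send_request` stands in for the real socket round trip; it takes the
    request bytes and returns the reply bytes, or raises on failure/timeout.
    """
    try:
        reply = send_request(PING)
    except Exception:
        return None
    return int(reply) if reply.isdigit() else None


if __name__ == "__main__":
    # Wire the client straight to the handler to show the round trip.
    pid = broker_is_alive(lambda msg: handle_request(msg, lambda job: job))
    print("broker pid:", pid)
```

A Nagios-style checker could call something like broker_is_alive() with a short timeout and alert when it returns None.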
What's next
I am wondering at this point, besides those small fixes, if Powerhose
should gain more features... Circus itself provides all the stats and
maintenance features we need to manage Powerhose workers.
Please let us know what you think !
[Source: Planet Mozilla]