Friday, September 23, 2011

Again: exports and module.exports in JavaScript Modules

I reread my posts and realized that I haven't been as clear as I'd like.

If you require() a module there are a couple simple rules:
  1. you don't need to put the '.js' on the end of the file.
  2. if you require('./my_module') the file must be in
    1. ./my_module.js or
    2. ./my_module/index.js
  3. if you require('my_module') the file must be in
    1. ./node_modules/my_module.js or
    2. ./node_modules/my_module/index.js or
    3. in a node_modules directory toward the root (../node_modules, ../../node_modules, and so on up to /node_modules)

As for the module file itself, you must set the object to be returned from the require() call. This is the part that flummoxed me.

It is simple: in your module there is a module variable, automagically. That module variable points to an object that is pre-populated with many properties. Just do a console.dir(module) in your empty module file to see it all.

The object that gets returned from the require() call is pre-allocated in the module.exports property. There is also a variable called exports that contains the very same object that is in module.exports. If you assign a new object to the exports variable, e.g. exports = {};, that new object will NOT be exported, as it is no longer the same object that is in module.exports. Think of it as if there were an invisible exports = module.exports = {}; line at the beginning of your module. The exports variable is only there for your convenience, and only the object pointed at by module.exports matters.

This is simple and I found no one who said it straight out. There was crap about exports being an "alias" for module.exports and other balderdash. If you want to create your own elaborate exported object just do this:
exports = module.exports = {
 ... my big-ass exported object ...
};

Thursday, September 22, 2011

CommonJS exports spec ambiguity

I have just been reading the spec for CommonJS. In the 1.0 Spec it refers to the exports variable and object interchangeably.

  1. In a module, there is a free variable called "exports", that is an object that the module may add its API to as it executes.
  2. modules must use the "exports" object as the only means of exporting.

There is no mention of "module" or "module.exports".

I've also realized I am a blithering idiot. I put console.dir(module); in an empty module and saw that it was populated with a great many things. Most pertinently, module has a property named exports, already initialized to an empty object. If you add something to module.exports, like module.exports.foo = 'bar';, then console.dir(exports); will output { foo: 'bar' }. It is sad how much my coding skills have atrophied over the last few years. I used to be rather sharp ... oh well.

Tuesday, September 20, 2011

Nodejs: module.exports and exports

I have just had a rough time learning how to create my own module. The problem is that I can't move on just knowing How to get something to work; I need to know Why. It is some form of programmer paralysis; worse than writer's block. I needed to know how and why exports works in Node.js modules.

Requiring modules is simple

var m = require('./my_module');
m.myFunc(foo, bar);
The file needs to be "./my_module.js", with the following exports line:
var myFunc = function () { ... };
module.exports.myFunc = myFunc;
The Why is the hard part. You can't do exports = {}, but you can do module.exports = {}. Turns out module is the "global" namespace in a require()'d file. exports is some sort of "alias" for module.exports (I haven't figured that out). What this means is that exports.myFunc = myFunc; is OK, but exports = {myFunc: myFunc} is NOT OK, BUT module.exports = {myFunc: myFunc} is OK. Basically, I've decided not to use this pseudo-whatchamacallit "alias" type thing exports. I'll just use module.exports. Additionally, it is probably best to use the form:
module.exports.myFunc = function () {
...
};
for all exported functions inside ./my_module.js

Tuesday, August 16, 2011

Synchronous coding style with EventMachine and Fibers

I did this a while ago. I am not really happy with it. As I've said before, event driven code doesn't bother me. Here is an example.

require 'fiber'
require 'eventmachine'

class AsyncIO < EM::Connection
  include EM::P::LineText2
  def initialize df
    @df = df
  end
  def receive_line line
    @df.succeed line
  end
end

def get_line
  f = Fiber.current
  df = EM::DefaultDeferrable.new
  df.callback { |line| f.resume(line) }
  EM.connect '127.0.0.1', 3000, AsyncIO, df
  return Fiber.yield
end

EM.run {
  Fiber.new {
    line = get_line
    puts line
    EM.stop
  }.resume
}

The truth is you should look at EM::Synchrony. Ilya Grigorik explains on his own blog how Fibers can help you write synchronous code while keeping the goodness of async IO (ala EventMachine).

I don't find event driven code to be "callback hell" like Ilya Grigorik does. I find evented code easier to read than synchronous code sometimes. In synchronous code you have to check, every time, whether the call failed and why. That is, C code like

if (rc < 0) {
    switch (errno) {
    case EINTR:
        /* retry */
    ...
    }
}

after every network call. In event driven code that stuff is dealt with in the event-loop and you just have to define the error callback once.

Ah well, just use Node.js. If JavaScript event driven programming is "too hard" for your delicate sensibilities, then explain to me how millions of Web-monkeys pull it off every day in the browser.

Friday, July 29, 2011

Morans! I am surrounded by Morans!

I posted a response on a blog recently. The fellow claimed, quite explicitly, that "Fibers and EventMachine were a response to Ruby's poor Threading performance". I responded that Fibers and EventMachine were not made in "response" to Ruby's Thread implementation. I am strangely polite in online discussions, where in person I'd have ripped this guy a new asshole.

I explained that Threads exist because n = read(fd, buf, len) blocks. Threads allow work to continue in parallel with blocked IO. He had it backwards: event driven IO, like EventMachine, exists so a user-space program can continue to execute work while the IO finishes in the Kernel. While user-space programmers don't explicitly invoke the switching of one thread to another, context switching is not costless. In fact the cost of a context switch is many hundreds of CPU instructions.

Relatedly, Fibers are cheap where Threads are expensive. Fibers switch into a parallel call stack, much like Threads, with the equivalent of a register switch and a long jump. Additionally, the ability to return results from a Fiber while preserving the Fiber's call stack state allows remarkable new behaviors to be constructed that cannot be done without them.

It seems these Ruby coders are one of two types: One, HTML monkeys that have learned how to code, or Two, Java refugees that got tired of the Bondage-and-Discipline style of programming. Both of these kinds of MORANS don't understand that Ruby is a remarkable programming language with features not available in other languages. Neither Perl nor Java has the concept that every variable is an object. Nor does either language have Fibers/coroutines. These features change the very nature of the code you write.

P.S.
I plan to post some code to show synchronous read/write IO calls while doing all the actual IO inside EventMachine. So you get the best of both worlds: synchronous-looking IO without blocking. Stay tuned; same Bat Time, same Bat Channel...

Saturday, July 23, 2011

Git and RSpec

I just watched a talk by Linus Torvalds at Google about Git. One of the questions touched on one of Linus' motivations for Git. That is Merging. I want to talk about how Git merging dove-tails nicely with Ruby's testing tool RSpec.

Some source control managers (SCMs) make a big deal about "branching" being cheap. Linus points out that branching is not the problem; merging is the problem people deal with. Git makes merging easy by reducing the number of conflicts the developer has to deal with. It does this in two ways. The first is algorithmic and the second is how Git changes your workflow.

Git allows for, and relies on, three way merging. CVS/SVN by contrast only does two way. Three way merging diffs the original file against each of the two conflicting versions of that file. Git can do so because it stores whole files, not just a long string of diffs (ala CVS/SVN, which I'll just call SVN from now on). Three way merges allow the merge algorithm to look at the context of the text being diffed. SVN can only look at line 245 and see that it changed. Git can notice that the line didn't change, it merely moved. Say both you and the conflicting version inserted text before the same line 245: there is no conflict, just code motion in the file. Three way diffs see this; two way diffs can't.

The second way Git deals with merge conflicts is not as obvious; it stems from the work flow. Specifically, branching is inherent in Git's design and merging is easier and better. So you branch often and merge often. More to the point you start your branch from the central "master" branch for your feature add; change & commit a bunch of times to that small branch; then pull or push your change back into "master". By repeating this branch-change-merge-push cycle early and often you have less opportunity for merge conflicts. Put another way, your branch exists for less time and syncs with the central master more often, resulting in less opportunity for two people to introduce genuine conflicts.
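The branch-change-merge-push cycle looks something like the following. The branch and file names are my own, and the whole thing runs in a throwaway repo so it is safe to try anywhere:

```shell
# Work in a scratch repo so this is safe to run.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email you@example.com
git config user.name "You"
trunk=$(git symbolic-ref --short HEAD)   # master or main, depending on git

echo 'v1' > app.txt
git add app.txt && git commit -qm 'initial'

git checkout -qb feature-x               # branch off the trunk for the feature
echo 'v2' >> app.txt
git commit -qam 'small change'           # commit early and often

git checkout -q "$trunk"
git merge -q feature-x                   # short-lived branch, painless merge
git log --oneline
```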

So a lot of talk is generated about Git working in a fully distributed way. However, that ignores that Git encourages syncing up those distributed merges early and often.

So imagine you replace SVN with Git. First, you don't have to do it to your whole tree at once. You can do it by subsystem. Also there are Git-SVN gateway tools to help do this. You can still have the central tree ala SVN, but you just branch off it and merge back more often. In SVN you are usually doing this anyways. Each SVN commit ends up coming after the feature is completed; much like the final merge of a Git branch, but without all the Git commits in the mean time. Or you create a longer lived development branch and merge that back as a monster merge.

I've explored other SVN branching/merging workflows, but all they do is push the pain to a different part of the process. A smart colleague of mine called it "squeezing the balloon". What you need to do is keep the master branch stable. That can be achieved by branching and merging smaller and faster. The SCM (or VCS if you prefer) makes all the difference.

What also helps is having a test suite you can rely on. First, the test suite has to be aimed at the internal API level. Second, it has to be easy and fast to run. This is where the choice of test framework dovetails with the SCM choice. "Early and Often" is the catch phrase for both.

Testing the API's "contract" is where you should aim your testing framework. As a side note: you are forced to state what that "contract" is, which is a level of internal documentation that often gets forgotten. The contract is what each function takes as input and what the results should be; especially the edge cases. An edge case: given a function "sum" that adds all the elements of an array, what if the function is given an empty array or a nil as input? That is the trivial case. Then there is the out-of-bounds case, like each element of an IP address (dot-quad) being between 0 (inclusive) and 256 (exclusive). Correctness should be tested if you can. Most of the time, correctness can't be tested without recreating the logic of the function or testing pre-canned input and results. But that leads to who-tests-the-tester (my favorite quote: "Quis custodiet ipsos custodes?").

Another variation of this "contract" is how it applies to the methods of an object. Methods have input and return results, but they may also alter the object's state. Input and result have to be tested as above: edge cases and out-of-bounds cases. But internal state? One internal state to test is internal consistency. Again there are edge cases of initialized (trivial) states and known inconsistent states.

RSpec is a good Ruby oriented tool-set built explicitly to test the contracts in your APIs. What the equivalent API test frameworks are for other languages is left as an exercise for the reader :)

Thursday, July 14, 2011

Little Idiom of Ruby I like

h = Hash.new { |h,k| h[k] = [] }
h['foo'] << "a"
h['bar'].push "b"
puts h.inspect
outputs {"foo"=>["a"], "bar"=>["b"]}. In other words, we declare a hash with a constructor block that sets each new key to have a value of an empty array. Otherwise, we would have to test each hash key to see if it was already initialized to an array and, if not, set that hash key's value to an empty array. It's a common thing to do, but Ruby makes it easy and automagic.

I suppose in Perl you can rely on autovivification.

push @{$h{'foo'}}, "a";
But as you can imagine the Ruby idiom can be generalized to more complicated initializations.
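For example, the same trick nests to give an auto-vivifying hash of hashes of arrays. This is my own sketch of the generalization, not code from the original post:

```ruby
# Each missing outer key gets a hash whose missing keys get an empty array.
h = Hash.new { |outer, k| outer[k] = Hash.new { |inner, k2| inner[k2] = [] } }

h['usa']['va'] << 'richmond'
h['usa']['va'] << 'norfolk'
h['france']['idf'].push 'paris'

puts h.inspect
# {"usa"=>{"va"=>["richmond", "norfolk"]}, "france"=>{"idf"=>["paris"]}}
```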

TextMate clone for winbloze

For anyone who may care, there is a TextMate act-a-like called E Text Editor. It supports TextMate bundles and key strokes. I haven't used it. But I imagine it must be manna from heaven if you are a TextMate user banished to the wilds of Microsoft-land.

Wednesday, July 13, 2011

iTerm2 Does Not Suck

Ringing endorsement eh?

Well, I've been using iTerm2 for a while instead of the default Terminal. Don't confuse iTerm2 with iTerm. While I am confused about that, I haven't spent any time figuring out the diff.

The first problem with iTerm2 was the colors. It has a very good color palette (Mac OS X builtin I presume) and an option in the Pref panel that will send you off to a page of preset palettes. I was happy to find one that approximated the Mac OS X Terminal colors I had come to like. With that color palette installed I moved over to my next issue.

AntiqueWhite background w/ black foreground. On a PC running Linux, AntiqueWhite out of rgb.txt was fine. I entered the same RGB codes for AntiqueWhite into Terminal Prefs and was happy. The same codes were not so easy to enter into iTerm2 Prefs. But I selected the Emacs palette and AntiqueWhite2 seemed to give me the happys.

I am transitioning from Emacs and screen. Even my screen command control character is C-o, not the default C-a, because, duh, C-a is beginning-of-line in Emacs. But screen doesn't do top-bottom splits. After a friend suggested iTerm2, I gave it a look, then went back to Terminal/screen.

The big change came when I was writing very wide log lines to the terminal. I needed a terminal with full screen width. Given screen wouldn't give me top-bottom splits, I tried iTerm2 again. The commands were easy to learn: ⌘-t for new tabs; ⌘-[ and ⌘-] to go back/forth between panes; and ⌘-⇧-[ and ⌘-⇧-] to go back/forth between tabs.

The scrolling is easy: just mouse scroll up/down in the window; no worries about the lack of scroll bar.

All in all, I've definitely moved from emacs/screen to TextMate/iTerm2, and I think I am more productive because of all the little things I gain. I just feel like all the key strokes programmed into the nerves of my hands are being wasted. I don't even remember a lot of the emacs commands. I just pull up emacs, do the command, and look at what my hands do. Then I can say "oh yeah. That's C-space hilite region C-x r t blah blah blah."

NOTE: the unicode characters above, ⌘ and ⇧, may not display correctly on non-Mac OS's. I found them on this page

Tuesday, June 28, 2011

Bye bye Emacs, Helloooo Sexy (TextMate)

If programming editors are a religion I am an Apostate.

While reading up on Ruby, I came across videos of people writing code real time. First, it is cool Ruby can do high level things in short programs. So short you can watch people coding real time. Second, more apropos, several different presenters were using this cool editor I'd never seen. I thought it was Vim with some crazy programming mode turned on. It completed statements and other coding constructs, speeding up the process of writing code.

This new editor was TextMate. It is a MacOSX app only. It has some simple bourne shell-like extension language. Or you can use any language you like. They come in "Bundles". Bundles exist for every language under the sun. Further Bundles exist for Version control systems. Checkin, checkout, update, branch, whatever. With windows to prompt for commit messages and the like.

The Git bundle allows you to switch branches real time. It even switches the files you have open in the editor. It has full colorization display windows for diffs and commit logs.

I have only scratched the surface of TextMate. You can start using it without having to buy into some IDE monstrosity like Eclipse. From what I can tell it rivals Eclipse or other IDEs for full Web Application development. Apparently it can run a full Web Application stack and support all the languages (Ruby/Python/Perl, HTML, CSS, and JavaScript) side by side.

I haven't used it for so involved a dev environment. But I like how it allows you to start small and simple, then build up to the full stack development environment. From what I can tell this applies equally to (Configure, make, C/C++) or (Ant/Maven, Java, SWT) or (Rake, Ruby on Rails, Web) application stacks.

I am still just playing with simple Project editing. My projects tend to be libraries and Server-side stuff anyways.

Tuesday, June 21, 2011

I took a look at perl5...

and found Modern::Perl. I always put use strict; use warnings; and such things in my code. use Modern::Perl; does more and is succinct. Apparently use v5.10; turns on say and the given ... when switch extension. There is a book about modern styled Perl that is worthwhile even for experienced Perl programmers. It is the impetus of the Modern::Perl module. You can buy the book or download an electronic version here.

I've always liked Perl. I found the blather about "line noise" and classes not compelling. Given the number of newbies using all this perl code, it can't be that impossible to read and write (though newbie code usually blows in any language). Classes are hard/complicated? The quantity and quality of CPAN code begs to differ.

I always thought that a switch was one of the features Perl needed. An automatic line separator after each print statement is good. That it is short is even better. Two things come to mind. One, Perl5 is alive and well. Two, what is left for Perl6?

To the second point, I answer: Grammars. I have looked at Grammars with wonder. Grammars provide a method to parse a full (programming) language. The key element that is hinted at is that each level of parsing provides a hook to attach code. So if people, say CPAN authors, write a grammar for parsing C code and publish it, then you can write code to use those hooks. The code is something like C::Grammar.parse($str, :$actions);.

If someone implemented Grammars in Perl5 there may be no need for Perl6.

'Nuff said.

Friday, June 10, 2011

Just compiled rakudo perl6. Perl6 is UGLY!

Simple things like getting the last element of an array have gotten grotesque. Perl5: $arr[-1]. Perl6: @arr[*-1] ughhh! I don't mind the '@' sigil; that change makes sense. But '*' in the index *-1 doesn't make sense on the face of it.

Why such a change? I didn't look deeper. It is such a non-intuitive change to such a common idiom.

Another seemingly unnecessary change: inline comments. The code goes $x = 1 #`(add one) + 1;. Why would I want to do that? In the 2+ decades I have been coding Perl professionally, I have never felt the need for inline comments. How does that help? How does it not obfuscate code?

Perl has a bad rep for being indistinguishable from line noise (or a cat walking on your keyboard). These two changes jumped out at me. They messed with the simple idiom for accessing the last n'th element of an array and added an unnecessary "inline comment" syntax that looks like more gobbledygook.

Supposedly, part of the design goals of perl6 was to make common things easier. So we have $obj.method() instead of $obj->method(); ok fine. And we have '$' for scalars always; '@' for arrays; and '%' for hashes to denote type rather than context. But what is that star '*' doing in the index of the array.

I was looking forward to Grammars as a very powerful tool. Talk about a text munging chainsaw! But some of these other changes are making Perl6 very un-perl. Or rather, making Perl6 indulge in the worst parts of old Perl.

I haven't been paying much attention to Perl since 5.8.1. I hope someone has backported some Grammar-like construct. If that is the case, then screw Perl6!

Tuesday, June 7, 2011

Still dealing with the old problem of Logging

Well it is two issues:
  1. Finding the basename of the program or module your code is within.
  2. Setting up the logger given the answer to the previous issue.
NOTE: I am playing with some CSS to display code. I am using the simplest which is applying a class to a div.


Getting the Fully Qualified Directory Name (FQDN) of the project directory; assuming the program is in a project/{bin,sbin,script} directory.
require 'pathname'
BASE_DIR = Pathname(__FILE__).dirname.expand_path(Dir.pwd).parent
puts "BASE_DIR=" + BASE_DIR.to_s
LIB_DIR = BASE_DIR + 'lib'
puts "LIB_DIR=" + LIB_DIR.to_s
$:.unshift LIB_DIR.to_s

Set up the logger format. This doesn't grab the library name or line number.
require 'logger'
log = Logger.new(STDOUT)
log.level = Logger::DEBUG
log.progname = File.basename(__FILE__)
log.datetime_format = "%Y-%m-%d %H:%M:%S"
log.formatter = proc { |sev, dt, prog, msg|
"[#{dt.strftime("%Y-%m-%d %H:%M:%S")}] #{prog}(#{Process.pid}) #{"%-5s"%sev}: #{msg}\n"
}
I am thinking of adding more data to each line. Like PID, line number, class/module name, and function. It is a lot, but if you are using the logs for debugging, the more the merrier. And while I prefer maintaining the sanctity of 80 columns for code, ultra wide log lines are not sacrilege.

Also, there are two more questions to be answered: Do I repeat this code in every file? How do I abstract it into a library?
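One answer to the second question: put the setup in a small module under lib/ and require it from every program. The module and method names below are my own invention, sketching how the two snippets above could be wrapped:

```ruby
require 'pathname'
require 'logger'

# Hypothetical lib/project_setup.rb -- names are made up for illustration.
module ProjectSetup
  # Project root, assuming the caller lives in project/{bin,sbin,script}.
  def self.base_dir(file)
    Pathname(file).dirname.expand_path(Dir.pwd).parent
  end

  # A logger with the same line format as the snippet above.
  def self.logger(file, io = $stdout)
    log = Logger.new(io)
    log.level    = Logger::DEBUG
    log.progname = File.basename(file)
    log.formatter = proc { |sev, dt, prog, msg|
      "[#{dt.strftime('%Y-%m-%d %H:%M:%S')}] #{prog}(#{Process.pid}) #{'%-5s' % sev}: #{msg}\n"
    }
    log
  end
end

log = ProjectSetup.logger(__FILE__)
log.info "logging from #{ProjectSetup.base_dir(__FILE__)}"
```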

Saturday, May 28, 2011

Typing Alt text on a Mac

I don't know how this works on a PC. But I found how to type alternative characters on a Mac. I kept reading about Ømq, aka ZeroMQ. I did a View Source in my browser cuz I thought Ø was some sort of HTML entity like & . Now I know you just type option-shift-o. This all started cuz a friend typed a temperature as 72º and I thought it was an o in superscript tags (it is option-0). Now I plan on using it all the time cuz the Web is UTF8, not the ASCII I am used to. „´‰ˇÁ¨ˆ. Oooh just discovered  neat.

P.S.
Oh, here is a table of many of the characters for Mac & PC

P.S. P.S.
And more with links to PDFs listing this stuff

Git ROCKS!

I can totally see using Git just for my personal use; much less all the fancy distributed version control stuff. I can already see myself throwing up a Git fork of a Ruby EventMachine lib for using Ømq. There was a method for sending a multipart message as one single call to zmq.send_msg(). I added another call, zmq.send_more(), that allows multiple calls for a multipart message before completing the message with zmq.send_msg(). I could put my simple extension up on github.com; send a note to the author; and let him use it or not.

Thursday, May 26, 2011

I am learning Git now

I am reading an online book Pro Git.

It is pretty cool. I can see the Kernel Hacker mentality in its design. For example, files are stored in .git/objects/ as files whose name is the SHA1 hash. The name of the file exists only in a directory-like file. This is like files on filesystems being stored by inode number, where the name of a file only exists in the directory file. This makes renames fast and hard links possible in filesystems. It has a similar usefulness in version control.

There is a lot more to learn about Git than, for instance, learning how to use Subversion. That is due to the fact that Git does so much more. Git runs completely locally. Switching branches is fast, like renaming a directory; as compared to recursively copying a directory.

I am only up to branching right now. I can definitely see there is coolness behind all the hype around Git. It is not just the Linus Torvalds(tm) brand name. Even for private projects, it looks like it will be useful.

Sunday, May 15, 2011

Things I got right but didn't describe in such detail.

I've read a couple things or three lately that justified some of my strongly held intuitions in the past.

  1. Javascript is really like LISP (and tcl as well).
  2. Event Driven code is not hard; stop complaining you whiner!
  3. Simulating non-blockingness just papers over how dramatic IO is.
  4. Code correctness and Resource contract testing is the Right Thing(tm).
1 & 2 are dealt with in Crockford's Javascript lectures.
3 is sorta dealt with in the Crockford lectures, but I want to write an example of papering over read/write in Ruby/EventMachine.
And 4 was justified in the long winded talk by J.B. Rainsberger, Integration Tests are a Scam.

1 is an observation that Javascript is a Function oriented language and so is LISP. You could probably map Javascript into some LISP-like derivative easily.

2 is a simple argument to make. Browser programming in Javascript is an EventLoop; look at all the total newbies that code Javascript in the Browser.

3 comes down to the Marx quote "A sufficient change in quantity is a change in kind". IO takes many orders of magnitude more time to execute than all the code that looks like z = x + y or even a1 = sort(a2). If it is so different in time to execute, then it shouldn't be represented as just-one-more-line-of-code.

4 boils down to two parts. One, your basic APIs do what is expected. And Two, your components implement mutually agreed upon contracts. One and Two are easy to test, and they expose the real source of the remaining bugs: Design flaws.

Update:
Here is a table I found.
I/O      Cycles       Order
L1       3            10^0
L2       14           10^1
RAM      250          10^2
Disk     41,000,000   10^7
Network  240,000,000  10^8

Thursday, May 12, 2011

I have been experimenting with ZeroMQ

It is a good library in several ways:
  • It generates regular sockets and uses the unixy socket APIs.
  • It handles connections automatically. What you really do is announce your desire to have a connection with ctx.connect(type, link, handler). If there is no accepting socket at that location(link) then it will continue to try to establish a connection in the eventloop. Hence, it doesn't matter whether the server or client starts first.
  • The link description contains three items in a URL style string: link = proto://address:port, e.g. "tcp://localhost:3000". Transports include 'tcp', 'inproc', 'ipc', 'pgm' and 'epgm'. 'tcp' is obvious. 'inproc' is an internal-to-a-process memory pseudo-socket (for thread interop). 'ipc' is Unix domain sockets ala /tmp/mysql.sock. And the 'pgm' types are named for a library that does multicast sockets.
  • Then there was framing. All messages are sent as arbitrarily sized opaque blobs. Each message on the wire is a length and a stream of bytes. sk.send("foo") sends 3 bytes (no null terminator unless you intentionally send one). There is a small addendum to this: a single "message" can have many parts. Each low-level send() call carries a SENDMORE flag, except the last send() call of the message, which has no SENDMORE flag.
  • Lastly there is the type argument of the ctx.connect() call. This sets the "Messaging Pattern" used on the socket. The messaging patterns determine where and how many copies of the message are sent, which messages are received versus ignored, and the queueing policy. This is the area I need to delve into deeper. But some patterns are intuitively usable; for instance REQ/REP and PUB/SUB (subscription topics are trivial to the point of lame).
It was created as the simple alternative to AMQP.

Wednesday, May 11, 2011

I have added Node.js to my Adventures

I have been exploring Node.js. Node.js is intrinsically event driven. I love that it is event driven only. There is no blocking; neither I/O blocking nor even sleep() blocking. The only way to simulate blocking is to busy-wait with while (true) {}. There are also non-blocking MySQL and PostgreSQL clients.

I have had a series of Jihads concerning computer languages over the decades. "Threads are Evil" is one such jihad. Of course that is hyperbole. Threads have a good and honorable place in the world o' computing. My beef is that they are used to compensate for blocking I/O and the usual programmer inertia about new things. Additionally, threads have several well known deficiencies. Another such jihad is the need for unsigned integers in Java.

Node.js is exceptionally fast compared to other dynamic language runtimes. It is based on the Google V8 JavaScript engine.

I've liked JavaScript since I had to create a dynamic metrics graph display tool. Now I've found a series of talks by Douglas Crockford that delve into the goodness in JavaScript. The Crockford series is illuminating for any computer programmer, regardless of the JavaScript focus. I then realized that Douglas Crockford was the author of an O'Reilly book I had purchased recently: JavaScript: The Good Parts.

Thursday, April 14, 2011

Project Idea

I want to (re)create a project I did in my time at AOL. It was a data pipeline for collecting system stats. The pipeline is an Agent per host to collect the stats, a message routing infrastructure, and a data storage end point. Currently my idea is to do all this in Ruby, 0MQ, and MySQL. Maybe later I will keep the timeseries data stored in something other than MySQL.

Mostly I want to play with 0MQ.

Sunday, April 3, 2011

Detecting remote connection closed

Normally, to detect that the peer of the socket closed the socket, you see read(2) return 0 bytes (orderly shutdown) or return -1 with errno set to ECONNRESET (on Mac OS X). But I have not figured out how EventMachine notifies the user of a peer disconnect.

EM (EventMachine) calls unbind on a socket close. But unbind fires both for a local close via close_connection and for a peer close, with no means to detect the difference.

Saturday, April 2, 2011

Simple Chat Server

Uses EM::Channel to communicate between two Server ports.

A simple `nc localhost 7000` to talk to the server

Learned a bunch of little things. More about transliterating things from Perl to Ruby.

Spent too much time trying to get /^(\w+)(?:\s+(\w+))*/ to do what I thought it would do. It doesn't even do what I think in Perl. Of course the obvious str.split(' ') didn't occur to me till I "gave up" and looked for a programmatic way to do what that hypothetical regex was meant to do.

Then "chan_sid = ..." didn't work, but "self.chan_sid = ..." did. I have to figure that out.

#!/usr/bin/env ruby -w

require 'rubygems'
require 'eventmachine'

class CmdExecutor < EM::Connection
  include EM::P::LineText2
  attr_reader   :port
  attr_reader   :cmd_prompt
  attr_reader   :rsp_prefix
  attr_reader   :greetings
  attr_reader   :chan
  attr_accessor :chan_sid
  def initialize port, chan, *args
    p port, chan, args
    @port = port
    @chan = chan
    @cmd_prompt = "[#{port}]> "
    @rsp_prefix = "[#{port}]# "
    @greetings = "Hello how may I be of assistance?"
    console "Setting port=#{@port}"
    console "Setting prompt=#{@cmd_prompt}"
    console "#{self.class} initialized"
  end
  def console *args
    puts "[#{port}] #{args.join(' ')}"
  end
  def send_line line
    send_data(line+"\n")
  end
  def post_init
    console "\"post_init\" for port=#{port}"
    self.chan_sid = chan.subscribe { |msg| receive_chan msg }
    console "channel sid = ##{chan_sid}"
    send_line greetings
    send_data cmd_prompt
  end
  def receive_chan msg
    (port, cmd, *args) = *msg
    console "FROM CHANNEL(#{chan_sid}): #{cmd} #{args.join ' '}"
    send_line rsp_prefix + "#{cmd} #{args.join ' '}"
  end
  def receive_line line
    (cmd, *args) = line.split(' ')
    cmd = cmd.downcase
    console "RECEIVED CMD: #{cmd} #{args.join(' ')}"
    case cmd
    when "close"
      console "\"close\" command called"
      console "close_connection will be issued"
      send_line rsp_prefix + cmd
      close_connection_after_writing
    when "quit"
      console "\"quit\" command called"
      console "EM.stop will be issued"
      send_line rsp_prefix + cmd
      EM.next_tick { EM.stop }
    else
      chan.push([port, cmd, *args])
      console "sent to chan (#{chan_sid}) \"#{cmd} #{args.join ' '}\""
      send_data cmd_prompt
    end #case data
  end #end def receive_line
  def unbind
    console "connection closed"
    chan.unsubscribe(chan_sid)
  end
  #private
end


EM.run {
  chan = EM::Channel.new
  EM.start_server("0.0.0.0", 7000, CmdExecutor, 7000, chan)
  EM.start_server("0.0.0.0", 7001, CmdExecutor, 7001, chan)
}

Friday, January 28, 2011

EventMachine echo server

Edit: I am trying to find the public interface for event servers and connections.


We need the require 'rubygems' so that the interpreter will know to search the rubygems install paths. Apparently, module locations are not hard coded into the interpreter.
#!/usr/bin/env ruby                                                              
require 'rubygems'
require 'eventmachine'

port = ARGV[0].to_i
puts "PORT>>>#{port}<<<"

module Echo
  def receive_data(data)
    puts "RECV>>>#{data.chomp}<<<"
    send_data(data)
    puts "SEND>>>#{data.chomp}<<<"
    if data.downcase.chomp == 'quit'
      EM.next_tick { EM.stop }
    end
  end
end

EM.run do
  EM.start_server("0.0.0.0", port, Echo)
end

Single line socket puts

#!/usr/bin/env ruby
require 'socket'

send_msg = ARGV[1]
send_msg ||= "[nil msg]"
puts "MSG  >>>#{send_msg}<<<"
port = ARGV[0].to_i
puts "PORT >>>#{port}<<<"

TCPSocket.open("localhost", port) do |sk|
  sk.puts send_msg
  puts "SENT >>>#{send_msg}<<<"

  recv_msg = sk.gets
  puts "RECV >>>#{recv_msg.chomp}<<<"
end

Echo Socket (single thread, single connection)

#!/usr/bin/env ruby
require 'socket'
port = (ARGV[0] || 7777).to_i

TCPServer.open('localhost', port) do |svr|
  loop do #do-while loop
    sk = svr.accept

    input = sk.gets
    puts "RECV >>>#{input.chomp}<<<"
    sk.puts input
    puts "SENT >>>#{input.chomp}<<<"
    sk.close
    break if input.downcase.chomp == 'quit'
  end #end do-while loop
end

Documentation

I had to learn everything from the web. Bulky books are a problem for me.

Almost too low-level API docs
http://www.ruby-doc.org/core-1.8.7/

Good "Programming Ruby" book
http://www.ruby-doc.org/docs/ProgrammingRuby/

EventMachine


Very Intro PDF
http://everburning.com/wp-content/uploads/2009/02/eventmachine_presentation.pdf

Better Intro PDF
http://everburning.com/wp-content/uploads/2009/02/eventmachine_introduction_10.pdf

Very Basic Tutorial
http://20bits.com/articles/an-eventmachine-tutorial/

Low Level API
http://eventmachine.rubyforge.org/