2008-12-18

The best way to browse Flickr to date

This must be the way to browse Flickr lazily: tag browser. That is the tag browser I've been waiting for, basically. I would guess they are using something like DBpedia?

Novint Falcon

This thing seems really cool. Not so much as a game controller, since I barely play games, but rather as a simple CNC machine ;)

Imagine building a foam sculptor or a 3D scanner from this. It should be easy, especially the scanner, since there are already buttons on the attachments, at least if you dare to modify the official attachments. Apparently they have PICs in them, too, so there might be a real protocol between them and the Falcon, which would make building your own attachments that much more of a challenge.

2008-10-22

Android Developers Blog: Android is now Open Source

So, the Android source was released. That is awesome, and I never doubted it would happen. On the other hand, Symbian is also something I would like to see the source of, since I use it every day! I read this morning that the source would cost money to get at, which is really sad. This is what I found:

"The Symbian Foundation platform will be available to members under a royalty-free license during first half 2009. The Foundation will provide, manage and unify the platform, ultimately releasing it as open source. "

Hopefully this is just a temporary measure, as I understand releasing it in the end means I can look at it for free! We can hope anyway...

2008-08-19

Android signing signs..

I hope this is how signing will be done even in Android devices:

There are no requirements on the key used to sign .apk files; locally-generated and self-signed keys are allowed. There is no PKI, and developers will not be required to purchase certificates, or similar. For developers who use the Eclipse/ADT plugin, application signing will be largely automatic. Developers who do not use Eclipse/ADT can use the standard Java jarsigner tool to sign .apk files.

Taken from http://code.google.com/android/RELEASENOTES.html. This is the major problem with J2ME in my opinion: no way for developers to self-sign stuff.

2008-07-28

Google Knol - Information without semantics

So I saw Knol mentioned on identi.ca. I checked it out, but oh boy was I disappointed. Absolutely no semantics at all? Wikipedia is more semantic than that! Faviki is more semantic still. And Freebase!

Unless they fix that I don't see how it will be successful in the long term. Other "semanticer" things will simply crush it.


Official Google Blog: Encouraging people to contribute knowledge

2008-07-06

Using joins to find missing data and blobs

I just realized, after having read about column-based storage, how potentially bad it is to have blobs in a metadata-rich table.

I am currently developing a small project in my spare time, and we have a table which is roughly: id int, mimetype varchar, data blob, version int, and so on with a few more non-blob columns in it. What I noticed was that selects on this table that do not need the blobs, and that use the indexed mimetype field, were incredibly slow. It is relevant that I am running MySQL in development mode, so it never seems to use more than a few tens of MB of RAM, meaning almost no caching is happening here. This means that in a row-storage database like MySQL I will be doing a tremendous amount of either seeking or reading of unnecessary data, depending on how MySQL plans its disk reads, how big the blobs are, and the effect of any read-ahead in the OS or elsewhere.

The other part of this slow query is an outer join, something like this:

select c.id from c
left outer join b on b.c_id = c.id and b.name = 'something'
where b.name is null

This is apparently also fairly slow. So don't do these things too much. In this case, the join used to find data that is missing from a joined table could easily be avoided by adding data to the db.
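A minimal sketch of the "find rows with no match" query above, using Python's built-in sqlite3 as a stand-in for MySQL. The sample rows and the index on b are assumptions made for illustration, and SQLite's performance says nothing about MySQL's. It only shows that the outer-join form and a NOT EXISTS form return the same rows, so either can be tried when one of them turns out slow:

```python
import sqlite3

# In-memory database with hypothetical sample data (not from the real project).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE c (id INTEGER PRIMARY KEY);
    CREATE TABLE b (c_id INTEGER, name TEXT);
    -- Assumed index so the lookup on (c_id, name) does not scan all of b.
    CREATE INDEX b_c_id_name ON b (c_id, name);
    INSERT INTO c (id) VALUES (1), (2), (3);
    INSERT INTO b (c_id, name) VALUES (1, 'something'), (2, 'other');
""")

# The outer join from the post: keep rows of c with no matching row in b.
left_join_sql = """
    SELECT c.id FROM c
    LEFT OUTER JOIN b ON b.c_id = c.id AND b.name = 'something'
    WHERE b.name IS NULL
    ORDER BY c.id
"""

# The same question phrased as a correlated NOT EXISTS subquery.
not_exists_sql = """
    SELECT c.id FROM c
    WHERE NOT EXISTS (
        SELECT 1 FROM b WHERE b.c_id = c.id AND b.name = 'something'
    )
    ORDER BY c.id
"""

missing_via_join = [row[0] for row in conn.execute(left_join_sql)]
missing_via_not_exists = [row[0] for row in conn.execute(not_exists_sql)]
print(missing_via_join, missing_via_not_exists)
```

Rows 2 and 3 come back from both queries: row 2 only has an 'other' entry in b, and row 3 has no entry at all.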

More funny Java bugs

So, I get this simple assignment at work. Install a new version of our software on a brand spanking new Intel Xeon quad-core machine. Nice.

Only not so very nice when I notice hard Java crashes. They always happen in the hasNext method of the iterator of EDU.oswego...ConcurrentHashMap. I tried downgrading to JDK1.5, but that does not help.

I might have to do yet another probing bug hunt in third-party software. Argh.


Also, I am trying out identi.ca, even though blogging, and micro-blogging especially, is not really my thing. I am finding it fairly good though, just like ping.fm.

2008-06-10

HtmlUnit, MultiThreadedHttpConnectionManager and memory leaks

I have been having this wonderful time at my work. A small coding project involving use of mostly HtmlUnit was almost done, and working properly. But what happens? By chance I notice that it is leaking memory: Perm Gen space, even.

This was coded as a plugin for a larger product, and was dynamically reloaded at every invocation. I first had to remake the plugin ClassLoader in this case into a post-delegating classloader so that I could override the version of HttpClient already in the product. I couldn't really be sure that it was not my changes to the classloader that gave rise to the leak, but eventually I got to that conclusion. The next step was to narrow down where this happened. Long story short, I found that if I changed HtmlUnit to not use the MultiThreadedHttpConnectionManager from HttpClient, it did not leak. I did not really want to do this though, being unsure of how HtmlUnit actually used it, and also because we have multiple threads using HtmlUnit.

The thing that solved the issue was to call shutdownAll in the connection manager. I am not allowed to access that from my code as a user of HtmlUnit though, and I did want to avoid having to recompile anything, so I used reflection to subvert the access checks. Calling shutdown on the one manager did not work, however, nor did closing the connection, which HtmlUnit already did by the way.

I can only assume this is some obscure bug that nobody else ever trips, but now at least if somebody does, they might find this as a reference.

I could not use the latest HtmlUnit because of needing JDK1.4 compatibility, so this was done in HtmlUnit 1.13. Oh, and 1.14 needed CSS stuff that clashed with regular DOM libraries, making classloading not work. Not sure why this does not work when I can safely override HttpClient with a newer version.

2008-06-06

Semantic Web is freaking cool! (and on a roll it seems)

Been surfing around for semantic web websites to find ontologies or datasources to (ab)use for Yet Another AI-Project From Me. This is what I found:


  • True Knowledge. Incredibly cool question-answering frontend to an incredibly complex data model, with a moderately complex and severely boring input process. Not free data; I cannot download a dump of their database.

  • Freebase. Took me some time to dig into this, actually, but I like what I am seeing. The data model seems a lot simpler than True Knowledge's, or at least that is what I think (subclassing, transitivity missing?). Inputting stuff is 2 to 10 times quicker/easier. For bulk stuff it is infinitely easier, since TK does not support that at all. Free data, but not RDF!

  • Faviki. Very nice and easy to use semantic social tagging/bookmarking service.

  • RDFScape. Visualizer for Cytoscape. Looks very nice, but I have not had time to play with it yet.

  • Attempto Controlled English. Maybe the least exciting of the bunch, but is useful for my NLP-related project.




I also got access to Twine. Oh my god what a bore. I just did not see the idea behind it, and the interface turned me off so much that after my third visit I never came back.

True Knowledge has some awesome NLP parsing going on, but it also fails miserably often. I have a simple idea to get me at least started; it pretty much builds upon AIML/patterns to extract meaning from stuff, specifically Wikipedia.

Freebase has a weak model in my mind: there does not seem to be a real inheritance hierarchy, and the "upper ontology" is basically missing. The upper ontology not being there is not such a big deal though, I think. There should be a set of "uppermost" classes in Freebase that can be mapped to SUMO/YAGO/DBpedia/WordNet or whatever to help with any inferencing/analogous thinking.

I cannot help but think that five years from now, "semantic" will not really exist as a label. Everything will be semantic by then, or long gone. AGI is not far behind, either. I predict a surge of NLP success in the coming few years, mainly with knowledge-intensive approaches. Common sense is still the missing piece of the puzzle; the above efforts do not concentrate on this at all, but rather on knowledge that is useful to humans. Remember, common sense is boring for humans to input and administer, since it is all so basic.

2008-04-28

Nokia E51 stability

I love my phone for several reasons. It is just the right size, fits very nicely in jeans-pockets without wearing them down in no time. My Motorola E398 was much worse in this regard due to being thicker.

I also love how it handles most J2ME apps, has email and so on. I've started using it during my commute every day to surf on my laptop, mostly via Bluetooth since USB cables are not that much fun to carry around. The phone is mostly stable. I've encountered crashes when, for example, running many apps and playing mp3s at the same time. Nicely enough, it reboots automatically most of the time. I think I've found what it does not handle so well though: Bluetooth internet sharing. Firstly, it gets very hot; that's fine though, I guess. But it also seems to become very unstable if I try to actually use it during this time.

Too bad, but I'm not surprised. I think that if I just don't touch anything while using it as a modem, it copes better.

2008-01-29

"not in" versus joins

I have been fighting this HQL query for probably 8 hours in total now, where I have the tables Entity and EntityAttribute. Every entity has zero or more attributes, so the EntityAttribute table has an Entity_ID column in it. Attributes have a name and a value, each in its own column. EntityAttribute is mapped as a map collection from Entity, with the "name" as the index. I want to select entities based on whether they do or do not have an attribute with a particular name and value, although I know that in practice those that have an attribute with the right name always have the right value, at the moment anyway.

The end of the HQL looks something like this:


SELECT DISTINCT e FROM Entity e
WHERE ...
... AND
'type' in indices(e.attributes)


'attributes' is the collection of attributes.

I would have guessed this to work, but no. I eventually tried this in plain SQL, where it of course also does not work.

I also tried with this, for the other way around (exclusion):


SELECT DISTINCT e FROM Entity e
WHERE ...
... AND
e.attributes['type'] != 'animation'


I can sort of understand this last construct being wrong. The correct way to do things is apparently to swap these two: use "not in" for exclusion, and use a "join" to include things, which is what e.attributes['type'] = 'animation' uses.
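To see why the join works for inclusion but not for exclusion, here is a rough sketch in plain SQL, run through Python's built-in sqlite3. The table and column names follow the description above, but the sample rows, the 'still' and 'author' values and the exact SQL are assumptions for illustration, not what Hibernate actually generates:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Entity (id INTEGER PRIMARY KEY);
    CREATE TABLE EntityAttribute (Entity_ID INTEGER, name TEXT, value TEXT);
    -- Hypothetical data: entity 3 has no attributes at all.
    INSERT INTO Entity (id) VALUES (1), (2), (3);
    INSERT INTO EntityAttribute (Entity_ID, name, value) VALUES
        (1, 'type', 'animation'),
        (2, 'type', 'still'),
        (2, 'author', 'someone');
""")

# Inclusion via join: entities that have type = 'animation'.
include_sql = """
    SELECT DISTINCT e.id FROM Entity e
    JOIN EntityAttribute a ON a.Entity_ID = e.id
    WHERE a.name = 'type' AND a.value = 'animation'
    ORDER BY e.id
"""

# Naive exclusion via join and !=: entities without any 'type'
# attribute have no row to join against, so they silently drop out.
naive_exclude_sql = """
    SELECT DISTINCT e.id FROM Entity e
    JOIN EntityAttribute a ON a.Entity_ID = e.id
    WHERE a.name = 'type' AND a.value != 'animation'
    ORDER BY e.id
"""

# Exclusion via NOT IN: everything except the entities found by inclusion.
exclude_sql = """
    SELECT e.id FROM Entity e
    WHERE e.id NOT IN (
        SELECT a.Entity_ID FROM EntityAttribute a
        WHERE a.name = 'type' AND a.value = 'animation'
    )
    ORDER BY e.id
"""

included = [row[0] for row in conn.execute(include_sql)]
naive_excluded = [row[0] for row in conn.execute(naive_exclude_sql)]
excluded = [row[0] for row in conn.execute(exclude_sql)]
print(included, naive_excluded, excluded)
```

With this data, the naive exclusion misses entity 3, which has no 'type' attribute at all, while the NOT IN form returns it. One caveat: NOT IN behaves unexpectedly if the subquery can return NULL, so Entity_ID is assumed to be non-null here.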