Posts

Showing posts from 2014

Upgrade your site checklist

You already have a site and now you want to upgrade it. What should you do?
- Make it fast - preferably without any server-side code (no PHP etc.), just plain HTML
- Use Xenu to see that there are no broken links
- Redirect your site to the www domain
- Create a 404 page (or redirect to your homepage)
- Make it responsive!
- Reduce the size of the site components
  - Compress images (many online services)
  - Dust off unused CSS selectors
  - Minify CSS
  - Minify JS
  - Minify HTML
  - Instead of minifying the above resources (CSS, JS, HTML) yourself, use Google's minifier, which minifies all of the above on the fly (it also uses a cache) and even joins the CSS & JS together so you will have fewer server calls.
- Use GTMetrix to analyze, then implement their suggestions
- Use Google Page Speed to analyze, then implement their suggestions
- Don't stop implementing speed fixes according to the above analyses till you get above a 90% speed score!
- Add Analytics (Google Analytics)
- Add Google Webmaster Tools
- Create Sitemap (and submit it into google's w…

How to analyze text in Java

I have a site with a form.
Users send me requests through that form, so I naively thought I could automate the process of reviewing those requests by analyzing their text. Makes sense, no?

Well, apparently I opened a Pandora's box called NLP.
It seems that text analysis is a vast subject, with many different algorithms for doing it.
To get some order out of the chaos, I want to split the subject into several sub-subjects:
- Sentence isolation - breaking the paragraph into sentences [not everyone uses a "period"]
- Naming - identifying names, places, dates, currency etc.
- POS tagging - finding the type of each word in the sentence (noun, verb etc.)
- Parsing - identifying sentence parts like subject, direct object etc.
There are many more parts and sub-parts, but the above are the ones I decided to focus on.
Please note that in order to be accurate the tools need a big "dictionary" of the parsed language, thus these tools might be very heavy on megabytes (some are several hundred M…
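To make the first sub-subject (sentence isolation) concrete, here is a minimal sketch that uses only the JDK's `java.text.BreakIterator`. It is not one of the heavy NLP tools mentioned above, and it relies on generic locale rules rather than a big dictionary, so expect it to be less accurate on messy text. The class name `SentenceSplitter` and the sample request are made up for illustration.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: splits free text into sentences with the JDK only.
class SentenceSplitter {

    // Returns the trimmed sentences found by the locale's sentence-boundary rules.
    static List<String> split(String text, Locale locale) {
        BreakIterator boundaries = BreakIterator.getSentenceInstance(locale);
        boundaries.setText(text);

        List<String> sentences = new ArrayList<>();
        int start = boundaries.first();
        for (int end = boundaries.next();
             end != BreakIterator.DONE;
             start = end, end = boundaries.next()) {
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        // Made-up form request, just to exercise the splitter.
        String request = "Hello there. Can you fix my order? It arrived broken!";
        for (String sentence : split(request, Locale.ENGLISH)) {
            System.out.println(sentence);
        }
    }
}
```

Naming, POS tagging and parsing have no JDK equivalent, which is where the dictionary-backed libraries come in.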

Crawling a site with Java

I wanted to crawl a site using Java.
I didn't want to reinvent the wheel, so I looked for a good Java library which crawls sites.
Off the top of my head I knew of several libraries, so I thought of checking them first. I began with Solr & Lucene and continued with Nutch.
After doing some reading I understood that the above libraries are out of my scope: although they will do the job, they are an absolute overkill.
Why?
Well, I guess I need to begin with my actual requirements. Before the requirements, I suggest reading the following post, just to understand the basics (I won't use that actual code, but it does help to understand the concept and to sharpen the requirements):
How to build a crawler in Java


Requirements
- Crawls a full site
- Minimum amount of logic so it won't go to the same URL more than once etc.
- Multithreading
- Good API so it will be easy to work with; I need a simple API to define the number of threads, site root, filter which pages to pull according to their URL…
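The requirements above can be sketched roughly as follows. This is not the API of Nutch or any other real crawler library, only a hypothetical `MiniCrawler` built on the JDK to show the shape of what I am after: a thread count, a site root, a URL filter, and a visited set so no URL is handled twice. Fetching and link extraction are left as a pluggable function (an assumption, so the sketch runs without a network); the demo feeds it a tiny in-memory "site".

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical single-use crawler: create one instance per crawl.
class MiniCrawler {
    private final Set<String> visited = ConcurrentHashMap.newKeySet();
    private final AtomicInteger inFlight = new AtomicInteger();
    private final CountDownLatch done = new CountDownLatch(1);
    private final ExecutorService pool;
    private final Function<String, List<String>> linkExtractor;
    private final Predicate<String> urlFilter;

    MiniCrawler(int threads,
                Function<String, List<String>> linkExtractor,
                Predicate<String> urlFilter) {
        this.pool = Executors.newFixedThreadPool(threads);
        this.linkExtractor = linkExtractor;
        this.urlFilter = urlFilter;
    }

    // Crawls everything reachable from root and returns the visited URLs.
    Set<String> crawl(String root) {
        try {
            if (schedule(root)) {
                done.await(); // released when the last page task finishes
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        return visited;
    }

    // Queues a URL unless the filter rejects it or it was already seen.
    private boolean schedule(String url) {
        if (!urlFilter.test(url) || !visited.add(url)) {
            return false;
        }
        inFlight.incrementAndGet();
        pool.execute(() -> {
            try {
                for (String link : linkExtractor.apply(url)) {
                    schedule(link);
                }
            } finally {
                if (inFlight.decrementAndGet() == 0) {
                    done.countDown();
                }
            }
        });
        return true;
    }

    public static void main(String[] args) {
        // In-memory stand-in for a real site: page -> links found on it.
        Map<String, List<String>> site = new HashMap<>();
        site.put("/", Arrays.asList("/a", "/b", "http://other.example/x"));
        site.put("/a", Arrays.asList("/", "/b"));
        site.put("/b", Collections.<String>emptyList());

        MiniCrawler crawler = new MiniCrawler(4, site::get, url -> url.startsWith("/"));
        System.out.println(crawler.crawl("/"));
    }
}
```

A real version would replace the in-memory map with an HTTP fetch plus HTML link extraction, and would also need politeness delays and error handling, which are left out here.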

2014 July, Eclipse Development Survey Analyzed

The Eclipse community has just released its yearly survey (about 900 participants).
I see this survey as a good source of data for getting a feel for the current development vibe.

Here is the Eclipse Survey

So, I went over the survey results several times and decided to write down my own conclusions. I only want to write about the big changes I saw that are worthy of my attention (and probably yours too, if you are reading this post).
My Personal Survey Conclusions
- Open source has lost some of its prestige (a pity) [Slide 9]
- Open Hardware and Internet of Things are here to stay [Slide 13, 17]
- Most Eclipse users upgrade their IDE soon after each yearly release [Slide 15]
- JavaScript is here to stay [Slide 21]
- Application Servers: Tomcat remains first, but JBoss is gaining [Slide 23]
- Repositories: CVS is finally dead, SVN is dying fast and Git is gaining fast [Slide 24]
- Build Tools: Ant is dying fast, Maven is leading big time, Gradle is gaining fast [Slide …

First Step with GWT

Why GWT?
- Compiles to JavaScript and uses Ajax to get only the needed fragments, thus cutting the network load
- Developer uses Java
- Developer uses mature IDE tools
- GWT compiler optimizes the JavaScript

This post will function as a quick reference on installation and main resources for beginning with GWT.
Installation
- Install Java JDK (set JAVA_HOME to the root and add the "bin" directory to the Path)
- Install Ant (set ANT_HOME to the root and add the "bin" directory to the Path)
- Download and "install" Eclipse
- Start Eclipse, then install the Google plugin. Google's update site contains several plugins; for regular GWT development install the following: Google plugin for Eclipse (for Google App Engine), GWT designer for GPE (drag & drop designer), GAE & GWT SDKs
That's it, you have set up your GWT development environment from scratch.


First steps Resources
- Official Google I/O GWT presentations
  - Introduction (2007)
  - GWT Roadmap for the Future (2013)
  - Demystifying …

Java DB queries - All of the Options (2014)

When you want to query a DB in Java, you have two main options:
1. Straightforward queries to the DB
This is the most intuitive (imho): just write down the query you want and get the results from the DB.
The most notable downside of this method is that you pollute your code with SQL queries, which can end up all over the code. There is no "one point of change" for the queries, which makes it very hard to change the DB schema, for example.

2. ORM-style queries
This way requires more work before you can begin querying the DB, but then you use Java objects instead of queries, so you keep your code clean.
You are also able to separate your model from your controller, and thus change your DB schema with much less pain.
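To show what option 1 looks like in practice, here is a minimal plain-JDBC sketch. The `users` table, its `name` and `city` columns, and the `UserQueries` class are all hypothetical, and the caller is assumed to supply an open `Connection` from whatever driver it uses. Keeping the SQL string in one constant is a small attempt at the "one point of change" that this style usually lacks.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical query holder: the SQL lives in one place instead of all over the code.
class UserQueries {

    // Assumed schema: a "users" table with "name" and "city" columns.
    static final String FIND_NAMES_BY_CITY = "SELECT name FROM users WHERE city = ?";

    // Runs the query with a bound parameter and collects the "name" column.
    static List<String> findNamesByCity(Connection conn, String city) throws SQLException {
        List<String> names = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(FIND_NAMES_BY_CITY)) {
            ps.setString(1, city);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    names.add(rs.getString("name"));
                }
            }
        }
        return names;
    }
}
```

With an ORM (option 2) the same lookup would go through a mapped `User` object instead of a SQL string, at the cost of the mapping setup described above.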



Which ORMs exist?
Many; the most notable ones are Hibernate (which was the first afaik) and EclipseLink/TopLink. But most ORMs are no longer just a third-party library (like Hibernate when it began); they are now part of a JEE implementation of the JPA interface (which itself was inspire…

Best way to send an email using Java

Which email type should I send: HTML or plain text, and why?
Well, HTML is nicer (links are clickable, fonts can be bold etc.) but may cause more problems, because some email clients might not be able to read it, not to mention smartphones or other devices which might prefer plain text emails.
My suggestion is to send both versions, an HTML one and a plain text one, bundled together in a single message (multipart email), and let the email client decide which one it wants to read.
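To show what "bundled together" means, here is a rough sketch that builds the raw text of a multipart/alternative message by hand, using only the JDK. In real code a mail library such as JavaMail would assemble and send this for you; the `MultipartEmail` class, the addresses and the boundary value are made up, and the sketch assumes plain ASCII content so it can skip transfer encoding. The plain text part goes first and the HTML part last, because clients are expected to pick the last alternative they can display.

```java
import java.util.UUID;

// Hypothetical helper: assembles a raw multipart/alternative message as text.
class MultipartEmail {
    private static final String CRLF = "\r\n";

    // Assumes ASCII-only headers and bodies; no encoding is applied.
    static String build(String from, String to, String subject,
                        String plainText, String html, String boundary) {
        StringBuilder msg = new StringBuilder();
        msg.append("From: ").append(from).append(CRLF);
        msg.append("To: ").append(to).append(CRLF);
        msg.append("Subject: ").append(subject).append(CRLF);
        msg.append("MIME-Version: 1.0").append(CRLF);
        msg.append("Content-Type: multipart/alternative; boundary=\"")
           .append(boundary).append("\"").append(CRLF);
        msg.append(CRLF);

        // Fallback part first: every client can show plain text.
        msg.append("--").append(boundary).append(CRLF);
        msg.append("Content-Type: text/plain; charset=US-ASCII").append(CRLF);
        msg.append(CRLF);
        msg.append(plainText).append(CRLF);

        // Richest part last: clients that can render HTML prefer it.
        msg.append("--").append(boundary).append(CRLF);
        msg.append("Content-Type: text/html; charset=US-ASCII").append(CRLF);
        msg.append(CRLF);
        msg.append(html).append(CRLF);

        // Closing delimiter ends the multipart body.
        msg.append("--").append(boundary).append("--").append(CRLF);
        return msg.toString();
    }

    public static void main(String[] args) {
        String boundary = "part-" + UUID.randomUUID();
        System.out.print(build("me@example.com", "you@example.com", "Hello",
                "Hi! Visit http://example.com",
                "<p>Hi! Visit <a href=\"http://example.com\">our site</a></p>",
                boundary));
    }
}
```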

But we still have some hurdles:
- Not all email clients read CSS
- Some firewalls might block HTML
- Some firewalls might block images
So what should I do? I think the best option is to eliminate as many obstacles as possible in the HTML version of the email, and then send it bundled with a plain text version, which should cover all cases.

So let's begin by eliminating obstacles. CSS can't be read by many email clients? Don't use CSS in your emails. HTML isn't supported? Bundle a plain text email with it. Images don't get past …