Wednesday, December 9, 2015

Creating intentional memory leak in Java

In Java you can get the impression that you don't have to think about memory management. This is true in the majority of cases. But there are limits: if you create too many objects of mixed sizes too fast, the garbage collector has to work harder and the application slows down.

Memory can also become fragmented, which forces the garbage collector to compact the heap and causes long pauses or a "java.lang.OutOfMemoryError". These long pauses are typically triggered when your Java program attempts to allocate a large object, such as an array.

Modern VMs are very efficient and deal well with rapid creation of small objects, but if you hit the limit your application will die or become unresponsive.

The concept of a memory leak is very simple: you introduce memory leaks by maintaining obsolete references to objects. An obsolete reference is simply a reference that will never be dereferenced again. This is the so-called "simple memory leak".

There are also "true memory leaks". You introduce these leaks when you create objects that are inaccessible to running code but are still stored in memory.

One famous example of a true leak is a concoction of a custom class loader and a long-running thread with thread-local variables, preferably inside an application container - mmmmmm, so good! :).
This works because the ThreadLocal keeps a reference to the object, which keeps a reference to its Class, which in turn keeps a reference to its ClassLoader. The ClassLoader, in turn, keeps a reference to all the Classes it has loaded.
After multiple deploys your application will break with a totally unexpected "java.lang.OutOfMemoryError: PermGen space" error.

There are many kinds of "out of memory" errors. If you are interested, look here for a description: memory leaks
 
But in practice, you will see these three most often.
  • java.lang.OutOfMemoryError: Java heap space
    • The heap is full.
  • java.lang.OutOfMemoryError: PermGen space
    • The permanent generation space is full.
  • java.lang.OutOfMemoryError: GC overhead limit exceeded
    • GC is working way too hard with little or no result.

In this blog post I decided to show how easy it is to create a memory leak. This can come in handy for a coding interview, or it can be a good example of what not to do.

All examples are runnable. All you need to do is clone the https://github.com/spookysleeper/codingwithpassion/tree/master/leaks repository and run the gradle script.

 

Byte leak


To run this example type: "gradlew runByteTest"

This is a demonstration of a pretty straightforward memory leak using an array list and byte arrays. The list keeps growing, and each element holds a reference to a one-megabyte byte array. Arrays need to be allocated as contiguous chunks of memory within the heap, and if memory is fragmented the GC struggles and finally breaks with a "java.lang.OutOfMemoryError: Java heap space" error.
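Here is a minimal sketch of this kind of leak. It is not the code from the repository: the class name ByteLeakSketch and the method leak are made up for illustration, and the run is bounded by default so that it terminates.

```java
import java.util.ArrayList;
import java.util.List;

class ByteLeakSketch {
    private static final int ONE_MEGABYTE = 1024 * 1024;

    // The static list is the only thing keeping the arrays reachable,
    // and nothing ever removes them: these are the obsolete references.
    static final List<byte[]> LEAK = new ArrayList<>();

    // Adds `chunks` one-megabyte arrays and returns how many are retained.
    static int leak(int chunks) {
        for (int i = 0; i < chunks; i++) {
            LEAK.add(new byte[ONE_MEGABYTE]);
        }
        return LEAK.size();
    }

    public static void main(String[] args) {
        // Bounded by default so the sketch terminates; pass a huge count
        // (for example 1000000) to actually exhaust the heap.
        int chunks = args.length > 0 ? Integer.parseInt(args[0]) : 16;
        System.out.println("Retained megabytes: " + leak(chunks));
    }
}
```

With a large enough count and a small heap (for example -Xmx64m) this should end with the heap space error described above.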



As you can see from this graph, GC didn't have a chance. It's a massacre!



List leak


To run this example type: "gradlew runListTest"

The list leak is similar to the previous example. It creates a list of BigDecimal objects whose references are never released. Simple and effective.
BigDecimal was chosen only because it is heavier than a simple Integer or Float.



You can see that this time GC is trying really hard to clean heap, but fails eventually.

 

 

Map key leak


The next leak is a bit more sophisticated, but at its core it is no different from the list leak. It demonstrates what happens when your implementation of hashCode is bad.
Elements will be added indefinitely, and every time the reference will remain active.

You can run this example by typing: "gradlew runMapBadKeyTest" or you can type "gradlew runMapGoodKeyTest" to test it with good key.
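To see why a bad key makes the map grow, here is a minimal sketch. It assumes the simplest kind of bad key, one that does not override equals() and hashCode() at all; the key used in the repository may be broken in a different way. The class and method names are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class MapKeyLeakSketch {

    // Bad key: inherits identity-based equals()/hashCode() from Object,
    // so every new instance is a different key, even with the same id.
    static final class BadKey {
        final String id;
        BadKey(String id) { this.id = id; }
    }

    // Good key: equality and hash code are based on the id.
    static final class GoodKey {
        final String id;
        GoodKey(String id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof GoodKey && ((GoodKey) o).id.equals(id);
        }
        @Override public int hashCode() { return id.hashCode(); }
    }

    // Puts the "same" logical key `puts` times and returns the map size.
    static int sizeWithBadKey(int puts) {
        Map<BadKey, String> map = new HashMap<>();
        for (int i = 0; i < puts; i++) {
            map.put(new BadKey("same"), "value");
        }
        return map.size();
    }

    static int sizeWithGoodKey(int puts) {
        Map<GoodKey, String> map = new HashMap<>();
        for (int i = 0; i < puts; i++) {
            map.put(new GoodKey("same"), "value");
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println("bad key entries:  " + sizeWithBadKey(1000));
        System.out.println("good key entries: " + sizeWithGoodKey(1000));
    }
}
```

With the bad key every put adds a new entry; with the good key the same entry is overwritten, so the map stays at one element.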



This time GC is not even trying, maybe because a StringBuilder with 100000 elements is so much heavier than a BigDecimal that GC simply doesn't have time to do anything.




Class leak


The permanent generation holds internal representations of Java classes, among other things (names of classes, methods, Strings...). The simplest way to introduce a memory leak in this memory area is to create too many classes. Another, more sophisticated example was mentioned earlier in this post as a "true memory leak".

To run example type: "gradlew runClassTest"



As you can see, it escalates pretty quickly. Because of this, you don't even get the PermGen error every time you run it; sometimes it just breaks on some random thing.





Thanks for reading, hope you like it! :)

Saturday, July 25, 2015

Java 8 Streams


Introduction

Every application creates and processes collections. Until recently, if you wanted to do some "finding" or "grouping" on collections in Java, you had to code it yourself. That is not very exciting and is a repetitive job by nature. Groovy, for example, offers great tools for transforming and managing collections. Check this link for some great examples. Java 8 borrows some concepts from Groovy, but also goes one step further with multi-core processing and the stream concept.

In SQL you don't need to implement how a grouping (or anything else) is calculated; you just describe what you want to have. The Stream API in Java 8 is guided by the same philosophy.

What is stream?

A stream is basically a sequence of elements from a source that supports aggregate operations. Let's break this statement down:
  • Sequence of elements: A stream provides an interface to a sequenced set of values. Implementations of this interface don't store values; values are computed at run time.
  • Source: This is where the values are stored: collections, arrays, I/O.
  • Aggregate operations: All the common SQL-like operations (group, count, sum) and functional programming constructs (filter, map, reduce, find, match, sorted).
Streams also have two fundamental characteristics:
  • Pipelining: This allows operations on a stream to be chained into a larger pipeline.
  • Internal iteration: Collections are iterated externally (explicit iteration), while streams do the iteration behind the scenes.
Streams are not collections! In a nutshell, collections are about data and streams are about computations. The difference between collections and streams has to do with when things are computed. Every element in a collection has to be computed before it can be added to the collection. In contrast, a stream is a conceptually fixed data structure in which elements are computed on demand. In the following example, no work is actually done until collect is invoked:
List<Integer> numbers = Arrays.asList(1, 4, 1, 4, 2, 8, 5);
List<Integer> distinct = numbers.stream()
      .map(i -> i * i)
      .distinct()
      .collect(Collectors.toList());
System.out.printf("integers: %s, squares : %s %n", numbers, distinct);
There are two types of stream operations:
  • Intermediate: these can be connected together because their return type is a Stream.
  • Terminal: this kind of operation produces a result from a pipeline, such as a List, an Integer, or even void (any non-Stream type).
Intermediate operations do not perform any processing until a terminal operation is invoked on the stream pipeline; they are "lazy".
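The laziness is easy to observe by counting how often the mapping function runs. The sketch below reuses the numbers from the example above; the class name LazyStreamSketch and the counter are added only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyStreamSketch {

    // Returns {map calls before the terminal operation, map calls after it}.
    static int[] countMapCalls() {
        AtomicInteger calls = new AtomicInteger();
        List<Integer> numbers = Arrays.asList(1, 4, 1, 4, 2, 8, 5);

        // Only intermediate operations so far: the pipeline is described,
        // but the lambda has not been executed yet.
        Stream<Integer> pipeline = numbers.stream()
                .map(i -> { calls.incrementAndGet(); return i * i; })
                .distinct();
        int before = calls.get();

        // The terminal operation pulls every element through the pipeline.
        pipeline.collect(Collectors.toList());
        int after = calls.get();

        return new int[] { before, after };
    }

    public static void main(String[] args) {
        int[] counts = countMapCalls();
        System.out.println("map calls before collect: " + counts[0]);
        System.out.println("map calls after collect:  " + counts[1]);
    }
}
```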

Streams also use short-circuiting: to return a result, sometimes only part of the stream needs to be processed, not all of it. This is similar to evaluating a large Boolean expression chained with the "and" operator.
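A small sketch of short-circuiting, assuming a sequential stream: findFirst stops the pipeline as soon as one element passes the filter, so the remaining elements are never inspected. The class name and the counter are made up for illustration.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

class ShortCircuitSketch {

    // Returns {number of elements inspected, first element greater than 3}.
    static int[] firstGreaterThanThree() {
        AtomicInteger inspected = new AtomicInteger();

        Optional<Integer> first = Stream.of(1, 4, 1, 4, 2, 8, 5)
                .peek(i -> inspected.incrementAndGet()) // counts elements pulled from the source
                .filter(i -> i > 3)
                .findFirst();                           // short-circuiting terminal operation

        return new int[] { inspected.get(), first.get() };
    }

    public static void main(String[] args) {
        int[] result = firstGreaterThanThree();
        System.out.println("inspected " + result[0] + " elements, found " + result[1]);
    }
}
```

Only the elements up to and including the first match are pulled from the source.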

This was just a high-level overview of the Stream API without detailed examples. It is easy to find examples in other sources, like here. In my opinion the Stream API is a great and refreshing new feature in Java 8, especially with its lazy, short-circuiting, multi-core features.

Tuesday, July 21, 2015

JavaScript Promises

What are promises?

 

Promises quickly became the standard way we handle asynchronous operations in JavaScript. Everybody who codes even a little bit in JavaScript is familiar with callbacks. The essence of using callback functions in JavaScript is passing a function as an argument to another function, which later executes that passed-in function or even returns it to be executed later.
There are several problems with callbacks. For example, when you need to be sure that two callbacks have finished before you do something, you must introduce new variables to track the state of each callback. Callbacks also lead to another problem, which you should already be familiar with: callback hell.

Callback hell

 

I think this all started with node.js, and callback hell gets a bad rap from the node.js community. This is because when you build a node application with express and mongoose, callbacks are all over the place.
When you need to perform a number of actions in a specific sequence in JavaScript, you must use nested functions. Something like this:
asyncCall(function(err, data1){
    if(err) return callback(err);       
    anotherAsyncCall(function(err2, data2){
        if(err2) return callback(err2);
        oneMoreAsyncCall(function(err3, data3){
            if(err3) return callback(err3);
            // are we done yet?
        });
    });
});
You can use promises to make this code prettier:
asyncCall()
.then(function(data1){
    // do something...
    return anotherAsyncCall();
})
.then(function(data2){
    // do something...  
    return oneMoreAsyncCall();    
})
.then(function(data3){
   // the third and final async response
})
.fail(function(err) {
   // handle any error resulting from any of the above calls    
})
.done();
A lot nicer, isn't it?
You can see that instead of requiring a callback, each call returns a Promise object. You can chain promises, because subsequent then() calls on the Promise object also return promises.
We don't need to check for an error in every callback, but only at the end of the promise chain. This is also a feature of promises. (The .fail() and .done() calls in this example come from the Q promise library; native promises use .catch() instead.)

Promises are not the only solution to callback hell. Sometimes callback hell is a direct consequence of poor code organization. In some cases promises only hide the underlying structural problems of the code. I mean, if you need five levels of indentation you're screwed anyway, and should fix your program. You can find here some hints on how to resolve callback hell.

Implementation 

 

Promises have arrived natively in JavaScript, but to finish I want to provide a half-baked promise implementation with comments, so you get a feeling for how promises are (or could be) implemented:
function Promise(fn) {
  var state = 'pending';
  var value;
  var deferred;
  
  //When the function we passed in is done, this will be called.
  //If then is called before resolve, the handler is deferred until the value arrives.
  //If then is called after resolve, the value is read from internal state.
  function resolve(newValue) {
    value = newValue;
    state = 'resolved';
    
    if(deferred) {
      handle(deferred);
    }
  }

  function handle(onResolved) {
    if(state === 'pending') {
      deferred = onResolved;
      return;
    }

    onResolved(value);
  }
  
  //This will be invoked when the client calls it.
  this.then = function(onResolved) {
    handle(onResolved);
  };

  //Executing the function that was passed into the promise.
  //We are waiting until this function calls resolve.
  fn(resolve);
}

function testPromise() {
    return new Promise(function(resolve) {
        var value = readFromDatabase();
        resolve(value);
    });
}

testPromise().then(function(databaseValue) {
    log(databaseValue);
});

Monday, September 15, 2014

SOLID object-oriented design

Do you know what SOLID (not solid, but S.O.L.I.D) object-oriented design stands for? It stands for: Single responsibility, Open-closed, Liskov substitution, Interface segregation and Dependency inversion.

The principles behind this acronym were collected by Robert Martin. According to him, these principles form the backbone of solid object-oriented design. You can read more about them in his book "Agile Software Development: Principles, Patterns, and Practices". I will try to describe these principles in following posts, in a timely manner of course. :)

For starters here are his views on bad object oriented design and what should be avoided:
  • Rigidity - It is hard to change because every change affects too many other parts of the system.
  • Fragility - When you make a change, unexpected parts of the system break.
  • Immobility - It is hard to reuse in another application because it cannot be disentangled from the current application.

Sunday, September 14, 2014

Java built-in profiling and monitoring tools

Java profiling is a very useful technique for finding performance bottlenecks and/or solving complete system failures. Common problems that can occur in a Java system of any size are slow services, JVM crashes, hangs, deadlocks, frequent JVM pauses, sudden or persistent high CPU usage, or even the dreaded OutOfMemoryError (OOME).

Finding this kind of bug is like an art, and you need a lot of experience to be good at it. That's why some programmers specialize in Java profiling. In some cases finding the bug is impossible if you don't know how the system works. Every Java programmer should know at least the basic profiling tools, because you can't always pay some external specialist to fix memory leaks or deadlocks for you.

Java comes with built-in tools for profiling and monitoring. Some of these tools are:

jmap

This is a tool that ships with Java. It is not a profiling tool as such, but it is very useful. Oracle describes jmap as an application that “prints shared object memory maps or heap memory details of a given process or core file or remote debug server”. And it is exactly that. The most useful option is to print a memory histogram report. The resulting report shows a row for each class type currently on the heap, with the count of allocated instances and the total bytes consumed. Using this report you can easily identify memory leaks if you have any.

jstack

jstack is also not a profiling tool, but it can help you identify thread deadlocks. The output of "jstack" is very useful for debugging. It shows how many deadlocks exist in the JVM process and the stack traces of waiting threads with source code line numbers, if the source code was compiled with debug options.

jconsole


JConsole is a graphical tool for monitoring the Java Virtual Machine (JVM) and Java applications on both local and remote machines. It is meant for monitoring and not for profiling, so for profiling you are better off using VisualVM, described below.

VisualVM

Another tool currently bundled with the JDK is VisualVM, described by its creators as “a visual tool integrating several command line JDK tools and lightweight profiling capabilities”. This tool can generate a memory graph that shows how your application consumes memory over time. VisualVM also provides a sampler and a lightweight profiler. The sampler lets you sample your application periodically for CPU and memory usage. It’s possible to get statistics similar to those available through jmap, with the additional capability to sample your method calls’ CPU usage. The VisualVM profiler gives you the same information as the sampler, but rather than sampling your application at regular intervals, it instruments the application's code and gathers statistics throughout normal execution.

For me these built-in tools work quite well, but if you want more specialized and more powerful tools for profiling  you can check: BTrace, EurekaJ and Eclipse Memory Analyzer (MAT).




Friday, September 12, 2014

Transform if else (conditional) with polymorphism

Switch and consecutive if-else statements are not necessarily anti-patterns, but you should consider using polymorphism in these situations, especially if the conditional is complex and long.
Sometimes when you refactor your code in this way, you actually learn something new about your data and make your code easier to follow.
Polymorphism also gives you an advantage when the same conditional appears in several places throughout your code and you want to introduce a new branch, which in the case of polymorphism will be just a new type.

Let's see it in an example:
import java.security.InvalidParameterException;

public class ConditionalPolymorphism {
    
    //smelly approach
    public int carSpeed(String car) {
        if ("Hyundai".equals(car)) {
            return 180;
        } else if ("Mazda".equals(car)) {
            return 160;
        } else if ("Nissan".equals(car)) {
            return 190;
        } else {
            throw new InvalidParameterException(
                    "Parameter not legal: " + car);
        }
    }

    //polymorphic approach
    public int carSpeed(Car car) {
        return car.speed();
    }

    public static void main(String[] args) {
        ConditionalPolymorphism conditionalPolymorphism = 
                new ConditionalPolymorphism();
        System.out.println(
                conditionalPolymorphism.carSpeed("Hyundai"));
        System.out.println(
                conditionalPolymorphism.carSpeed(new Hyundai()));
    }
}

interface Car {
    int speed();
}

class Hyundai implements Car {

    public int speed() {
        return 180;
    }
}

class Mazda implements Car {

    public int speed() {
        return 160;
    }
}

class Nissan implements Car {

    public int speed() {
        return 190;
    }
}
This is a pretty simple and naive example, but you can see the basic mechanics behind it. The polymorphic code is more elegant and is written in a more object-oriented manner. Also, in this version you can omit the parameter check, which further reduces code complexity.
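To illustrate the "new branch is just a new type" point, here is a small sketch. The Toyota class and its speed value are hypothetical and exist only for this illustration; the types are nested in one class so that the sketch is self-contained.

```java
class NewCarTypeSketch {

    interface Car {
        int speed();
    }

    static class Hyundai implements Car {
        public int speed() { return 180; }
    }

    // Hypothetical new brand: this class is the only code that has to be added.
    static class Toyota implements Car {
        public int speed() { return 170; }
    }

    // The caller stays unchanged, no matter how many Car types exist.
    static int carSpeed(Car car) {
        return car.speed();
    }

    public static void main(String[] args) {
        System.out.println(carSpeed(new Hyundai()));
        System.out.println(carSpeed(new Toyota()));
    }
}
```

With the if-else version, every place that repeats the conditional would need a new branch instead.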

Saturday, September 6, 2014

Type Erasure and Bridge Methods in Generics

"Generic programming is about abstracting and classifying algorithms and data structures. Its goal is the incremental construction of systematic catalogs of useful, efficient and abstract algorithms and data structures"

-Alexander Stepanov

Generic programming concepts are nothing new, and they are not something Java introduced to the world (Ada, Eiffel and C++ supported generics before Java did). In 1988 David Musser and Alexander Stepanov introduced and defined this concept.

Java introduced generics in 2004 (Java 5) and implemented them using type erasure. Type erasure consists of the following steps:

  • Replace all type parameters in generic types with their bounds or Object if the type parameters are unbounded. The produced bytecode, therefore, contains only ordinary classes, interfaces, and methods.
  • Insert type casts if necessary to preserve type safety.
  • Generate bridge methods to preserve polymorphism in extended generic types.
You can conclude from this that generics in Java are purely a compile-time feature. Because of this, generics in Java incur no run-time overhead, and it is important to point this out. I suspect generics were implemented as a compile-time correctness check because the designers were focused on backward compatibility.
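A quick way to convince yourself that the type arguments are gone at run time is to compare the runtime classes of two differently parameterized lists. This is only a small sketch; the class name is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureAtRuntimeSketch {

    // Both lists share a single runtime class, because the type
    // arguments exist only at compile time.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> integers = new ArrayList<>();
        return strings.getClass() == integers.getClass();
    }

    public static void main(String[] args) {
        System.out.println("same runtime class: " + sameRuntimeClass());
    }
}
```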

Here is a simple demonstration of replacing a generic type with its bound. In this case the bound is Object. I know there are better ways of copying an array to a collection (like the Collections.addAll method), but this is only for demonstration purposes, so I will stick with it. ;)
public static <T> void array2Coll(T[] a, Collection<T> c) {
  for (T o : a) {
    c.add(o);
  }
}
After type erasure code will look like this:
public static void array2Coll(Object[] a, Collection c) {
  for (Object o : a) {
    c.add(o);
  }
}
As you can see, the generic type has been replaced with the Object type (its upper bound). If the bound were something else (for example <T extends Comparable<T>>), the type parameter would be replaced by Comparable.
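The bounded case can be checked with reflection: after compilation, a method declared with a Comparable bound can be looked up using Comparable parameter types. The max method below is a made-up example, not something from the post.

```java
import java.lang.reflect.Method;

class BoundErasureSketch {

    // T is bounded, so after erasure T becomes Comparable, not Object.
    static <T extends Comparable<T>> T max(T a, T b) {
        return a.compareTo(b) >= 0 ? a : b;
    }

    // Looks the method up by its erased parameter types and renders its signature.
    static String erasedSignature() {
        try {
            Method m = BoundErasureSketch.class
                    .getDeclaredMethod("max", Comparable.class, Comparable.class);
            return m.getReturnType().getSimpleName() + " max("
                    + m.getParameterTypes()[0].getSimpleName() + ", "
                    + m.getParameterTypes()[1].getSimpleName() + ")";
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(erasedSignature());
        System.out.println(max(3, 7));
    }
}
```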

Sometimes the compiler creates a synthetic method, called a bridge method, as part of the type erasure process. The next examples explain why and when the compiler creates these methods.
public class Node<T> {
  private T data;

  public Node(T data) { this.data = data; }

  public void setData(T data) {
    System.out.println("Node.setData");
    this.data = data;
  }
}

public class MyNode extends Node<Integer> {
  public MyNode(Integer data) { 
    super(data); 
  }

  public void setData(Integer data) {
    System.out.println("MyNode.setData");
    super.setData(data);
  }
}
After type erasure, the compiler creates one synthetic bridge method for the second class:
public class Node {

  private Object data;

  public Node(Object data) { this.data = data; }

  public void setData(Object data) {
    System.out.println("Node.setData");
    this.data = data;
  }
}

public class MyNode extends Node {

  public MyNode(Integer data) { 
    super(data); 
  }

  // synthetic bridge method
  public void setData(Object data) {
    setData((Integer) data);
  }

  public void setData(Integer data) {
    System.out.println("MyNode.setData");
    super.setData(data);
  }
}
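In this example the bridge method was created because, after erasure, the MyNode class was missing a setData method with an Object parameter. Without it, a call made through a Node reference would bypass MyNode.setData and we wouldn't have proper polymorphic behavior. With the bridge in place, the next example (which uses the raw Node type) throws ClassCastException, because the bridge method casts its argument to Integer.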
  MyNode mn = new MyNode(5);
  Node n = mn;  
  n.setData("Hello"); //throws ClassCastException
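You can also see the bridge method with reflection. The sketch below nests simplified copies of Node and MyNode in a helper class (the helper class and its method names are made up for illustration) and lists the setData methods the compiler emitted for MyNode.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class BridgeMethodSketch {

    static class Node<T> {
        private T data;
        Node(T data) { this.data = data; }
        public void setData(T data) { this.data = data; }
    }

    static class MyNode extends Node<Integer> {
        MyNode(Integer data) { super(data); }
        @Override
        public void setData(Integer data) { super.setData(data); }
    }

    // Lists the parameter type of every setData method declared in MyNode
    // and marks the ones the compiler flagged as bridge methods.
    static List<String> describeSetDataMethods() {
        List<String> out = new ArrayList<>();
        for (Method m : MyNode.class.getDeclaredMethods()) {
            if (m.getName().equals("setData")) {
                out.add(m.getParameterTypes()[0].getSimpleName()
                        + (m.isBridge() ? " (bridge)" : ""));
            }
        }
        Collections.sort(out);
        return out;
    }

    // Calls setData through the raw type, as in the example above.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean rawCallThrows() {
        Node n = new MyNode(5);
        try {
            n.setData("Hello"); // dispatched to the bridge, which casts to Integer
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(describeSetDataMethods());
        System.out.println("raw call throws: " + rawCallThrows());
    }
}
```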

Friday, June 13, 2014

groovy Closures memoization

Memoization (if you don't know it already) is a technique that lets a Groovy closure remember the result of its invocation for a specific set of inputs. It does this by internally caching the result.
This is a quite common technique for speeding up recursive algorithms (usually you would use maps for this).
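For comparison, here is what the hand-rolled, map-based approach could look like, sketched in Java rather than Groovy. The memoize helper is made up for illustration, it is not a library method, and it is not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class MemoizeSketch {

    // Wraps a function so that each distinct argument is computed only once.
    // The map is the cache: argument -> previously computed result.
    static <A, R> Function<A, R> memoize(Function<A, R> fn) {
        Map<A, R> cache = new HashMap<>();
        return arg -> {
            if (!cache.containsKey(arg)) {
                cache.put(arg, fn.apply(arg));
            }
            return cache.get(arg);
        };
    }

    // Calls a memoized function `repeats` times with the same argument
    // and returns how many times the real computation ran.
    static int computationsFor(int repeats) {
        AtomicInteger computations = new AtomicInteger();
        Function<Integer, Integer> square = memoize(x -> {
            computations.incrementAndGet(); // stands in for a lengthy calculation
            return x * x;
        });
        for (int i = 0; i < repeats; i++) {
            square.apply(3);
        }
        return computations.get();
    }

    public static void main(String[] args) {
        System.out.println("real computations for 5 calls: " + computationsFor(5));
    }
}
```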

Groovy supports memoization through its .memoize() method.
The first invocation does the actual work, and each subsequent invocation pulls the result from the cache.
If you run the program at the end of this post, you will see something like this as output:


Test without memoization:
Adding 1 and 3 took 1540 msec with result 4.
Adding 1 and 3 took 1500 msec with result 4.
Adding 1 and 3 took 1501 msec with result 4.
Adding 1 and 3 took 1500 msec with result 4.
Adding 1 and 3 took 1500 msec with result 4.
Test with memoization:
Adding 1 and 3 took 1500 msec with result 4.
Adding 1 and 3 took 0 msec with result 4.
Adding 1 and 3 took 0 msec with result 4.
Adding 1 and 3 took 1 msec with result 4.
Adding 1 and 3 took 0 msec with result 4.


As you can see, there is quite a difference between the two tests. It goes without saying that your closures should return the same result for the same parameters.
addTwoNumbers = {int a, b ->
    //simulate some lengthy calculation
    Thread.sleep(1500)
    a + b
}
println("Test without memoization:")
def testItWithoutMemoization(a, b) {
    long start = System.currentTimeMillis()
    long result = addTwoNumbers(a, b)

    println("Adding $a and $b took " +
            "${System.currentTimeMillis() - start} " +
            "msec with result $result.")
}

testItWithoutMemoization(1, 3)
testItWithoutMemoization(1, 3)
testItWithoutMemoization(1, 3)
testItWithoutMemoization(1, 3)
testItWithoutMemoization(1, 3)

addTwoNumbersWithMem = addTwoNumbers.memoize()

println("Test with memoization:")
def testItWithMemoization(a, b) {
    long start = System.currentTimeMillis()
    long result = addTwoNumbersWithMem(a, b)

    println("Adding $a and $b took " +
            "${System.currentTimeMillis() - start} " +
            "msec with result $result.")
}

testItWithMemoization(1, 3)
testItWithMemoization(1, 3)
testItWithMemoization(1, 3)
testItWithMemoization(1, 3)
testItWithMemoization(1, 3)

Tuesday, June 10, 2014

Grails - dataBind in Service Layer

Here is a quick tip on how you can use the Grails bindData method outside of a Grails controller.
import org.codehaus.groovy.grails.web.metaclass.BindDynamicMethod

class DataBinder {

    private static BindDynamicMethod bindDynamicMethod = new BindDynamicMethod()

    /**
     * make the controller bindData method statically available, e.g. for service layer use
     * implemented as closure to allow static import emulating controller layer bindData usage 1:1
     */
    static Closure bindData = { Object[] args ->
        bindDynamicMethod.invoke(args ? args[0] : null, BindDynamicMethod.METHOD_SIGNATURE, args)
    }
}

//usage
class TestBind {
    public void intermediateNotification(Map params) {
        Test test = new Test()
        DataBinder.bindData(test, params)
        test.save()    
    }
}

Friday, June 6, 2014

Android remote logger - send logs to server

I want to show you a simple implementation of a remote logger for Android. It can be used for sending logs from your app to a server (backend).

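import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.List;
import java.util.Vector;

// App and HttpHelper are the application's own helper classes (not shown here).
public final class Logger {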

    public static final String DISPATCHER_REPORT_URI = App.getDispatcherUri() + "/vidLog/report";

    private static final int CAPACITY = 10;
    private static final String APPLICATION_LOG_TAG = "YourApp";
    private static List<String> logs = new Vector<String>(CAPACITY);
   
    public synchronized static void r(String msg) {

        if (logs.size() + 1 > CAPACITY) {
            flushRemote();
        }

        logs.add(buildLog(msg));
    }

    public synchronized static void flushRemote() {
        final String formattedLogs = collectAndFormatLogs();

        Runnable sendLogs = new Runnable() {
            @Override
            public void run() {
                HttpHelper.doPostRequest(DISPATCHER_REPORT_URI, "APP_LOG", formattedLogs);
            }
        };
        new Thread(sendLogs).start();

        logs.clear();
    }

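    // Joins all buffered log lines into one newline-separated string.
    private static String collectAndFormatLogs() {
        StringBuilder logsFormatted = new StringBuilder();
        for (String log : logs) {
            logsFormatted.append(log).append("\n");
        }
        return logsFormatted.toString();
    }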

    private static SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss,SSS");

    private static String buildLog(String msg) {
        return sdf.format(new Date()) +
                " REMOTE " +
                APPLICATION_LOG_TAG + " " +
                msg;
    }

}
//Usage
Logger.r("first message");
Logger.r("second message");

//Logs are flushed automatically when the buffer is full (CAPACITY entries); you can also flush them manually.
Logger.flushRemote();