Keep It Simple Stupid

Tuesday, October 30, 2018

parameter in Kotlin Primary constructor

Writing var or val in front of a primary-constructor parameter declares a property of the class. Without var/val it is only a parameter of the primary constructor: you can read it inside **init** blocks or use it to initialize other properties, but it never becomes a property, so it is not accessible from member functions or from outside the class.

SAP HANA: get max record for a group

1. With Rank node

2. With Aggregation/Join node

Performance:

Rank node wins.

Tuesday, May 29, 2018

2d array in python3

m = 5
n = 3

a = [[0 for x in range(n)] for y in range(m)]

Or a shorter version:
a = [[0]*n for y in range(m)]

Note: shortening this to something like the following does not really work since you end up with 5 copies of the same list, so when you modify one of them, they all change.
a = [[0]*n]*m
print(a)
a[1][2] = 3
print(a)

[[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]]
[[0, 0, 3], [0, 0, 3], [0, 0, 3], [0, 0, 3], [0, 0, 3]]

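The inner [0] * n is safe because integers are immutable: all n slots refer to the same 0 object, but an assignment such as row[2] = 3 rebinds one slot instead of changing the 0 itself, and the result is [0,0,0]. The outer * m is the problem, because it repeats a reference to one mutable list. To see the difference, suppose you had a variable x = [0,0,0]; then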

c1 = x * 5
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

c2 = [x] * 5
[[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]]

c2[0][1] = 22 (any row index gives the same result, since every row is x)
[[0, 22, 0], [0, 22, 0], [0, 22, 0], [0, 22, 0], [0, 22, 0]]
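
The aliasing described above can be checked directly with the is operator and id(). This is a minimal sketch; the names shared and independent are made up for the illustration.

```python
m, n = 5, 3

# Outer multiplication copies references: every row is the same list object.
shared = [[0] * n] * m
# A comprehension evaluates [0] * n once per row, so each row is a new list.
independent = [[0] * n for _ in range(m)]

shared[1][2] = 3
independent[1][2] = 3

print(shared[0] is shared[1])                 # True: one list, referenced m times
print(independent[0] is independent[1])       # False: m separate lists
print(len({id(row) for row in shared}))       # 1 distinct row object
print(len({id(row) for row in independent}))  # 5 distinct row objects
```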

Thursday, April 19, 2018

download notebooks/training set/test set from Coursera

Go to the home of the coursera-notebook hub
Create a new python notebook
Execute !tar cvfz allfiles.tar.gz * in a cell
Download the archive.

Enjoy!

If the resulting archive is too big and you can't download it

Open the Python notebook where you executed the last command and execute the following in a cell:

!split -b 200m allfiles.tar.gz allfiles.tar.gz.part.

This splits the archive into 200 MB pieces that you can download one at a time. If a piece is still too large, change 200m to a smaller value.

Once you have downloaded all the pieces, rejoin them on your own machine with the following command (on Linux, or in cmder on Windows):

cat allfiles.tar.gz.part.* > allfiles.tar.gz
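
If cat is not available, the same concatenation can be sketched in Python. This assumes the pieces keep the default suffixes that split generates (aa, ab, ...), so sorting the file names restores the original order; join_parts is a hypothetical helper name, not part of any tool mentioned above.

```python
import glob
import shutil

def join_parts(prefix, output):
    """Concatenate every file whose name starts with prefix, in sorted order, into output."""
    # split names its pieces prefix + "aa", "ab", ...; a plain sort keeps that order.
    parts = sorted(glob.glob(glob.escape(prefix) + "*"))
    with open(output, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)  # stream the bytes; no full read into memory
    return parts

# Usage, run from the folder holding the downloaded pieces:
# join_parts("allfiles.tar.gz.part.", "allfiles.tar.gz")
```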

PS: This is in fact valid in any Jupyter-notebook hub

There is a simpler way: in the notebook's file manager, click "New", then "Terminal". This opens a full terminal where you can run any command you want (such as tar).

https://github.com/coursera-dl/coursera-dl

Saturday, March 31, 2018

Break training data into mini-batches for stochastic gradient descent:

import numpy as np
n = 100
training_data = list(range(n))
mini_batch_size = 10
np.random.shuffle(training_data)
mini_batches = [training_data[k:k+mini_batch_size]
    for k in range(0, n, mini_batch_size)]
mini_batches

[[90, 5, 70, 82, 58, 2, 16, 85, 12, 35],
[14, 54, 62, 39, 96, 73, 60, 80, 33, 89],
[20, 38, 76, 47, 65, 42, 71, 46, 93, 34],
[52, 64, 13, 92, 17, 49, 88, 63, 74, 23],
[43, 25, 10, 97, 48, 68, 95, 81, 24, 31],
[9, 32, 84, 83, 22, 87, 61, 26, 28, 99],
[0, 67, 30, 69, 72, 45, 79, 51, 40, 55],
[6, 15, 75, 66, 29, 3, 18, 77, 98, 21],
[53, 44, 50, 19, 91, 8, 11, 59, 27, 56],
[36, 94, 7, 57, 1, 37, 86, 78, 41, 4]]
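
When features and labels live in separate arrays, the same slicing idea can be applied to one shuffled index array so that rows and labels stay aligned. The following is only a sketch under that assumption; iterate_minibatches, X and y are hypothetical names, and the toy data is invented for the example.

```python
import numpy as np

def iterate_minibatches(X, y, batch_size, rng):
    """Yield (X_batch, y_batch) pairs covering one shuffled pass over the data."""
    order = rng.permutation(len(X))   # one shuffled index array used for both arrays
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        yield X[idx], y[idx]

rng = np.random.default_rng(0)        # seed chosen only to make runs repeatable
X = np.arange(20).reshape(10, 2)      # 10 made-up samples, 2 features each
y = np.arange(10)                     # label i belongs to row i
batches = list(iterate_minibatches(X, y, 4, rng))
print([len(yb) for _, yb in batches])  # [4, 4, 2]: the last batch is smaller
```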

Thursday, March 29, 2018

Indices in Python list

Python indices and slices can feel uncomfortable at first, but once you understand them they are convenient and simple:

>>> a = list(range(10))
>>> a
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> a[::2]
[0, 2, 4, 6, 8]
>>> a[1::2]
[1, 3, 5, 7, 9]
>>> a[::-2]
[9, 7, 5, 3, 1]
>>> a[1::-2]
[1]
>>> a[1:8]
[1, 2, 3, 4, 5, 6, 7]
>>> a[1:-2]
[1, 2, 3, 4, 5, 6, 7]
>>> a[::-1]
[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
>>> a[100]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range
>>> a[:100]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> a[7:100]
[7, 8, 9]
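
To see why out-of-range slices such as a[7:100] succeed while a[100] fails, it can help to ask a slice object which indices it resolves to. A small sketch using the built-in slice.indices method:

```python
a = list(range(10))

# slice.indices(length) returns the (start, stop, step) actually used for that length.
print(slice(7, 100).indices(len(a)))          # (7, 10, 1): stop is clamped to len(a)
print(slice(1, -2).indices(len(a)))           # (1, 8, 1): -2 counts from the end
print(slice(None, None, -2).indices(len(a)))  # (9, -1, -2): defaults flip for a negative step
print(slice(1, None, -2).indices(len(a)))     # (1, -1, -2): starts at index 1 and walks left

# A slice is equivalent to indexing with range(start, stop, step).
resolved = [a[i] for i in range(*slice(7, 100).indices(len(a)))]
print(resolved == a[7:100])                   # True
```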

Reference:

https://docs.python.org/3/tutorial/introduction.html#strings

Monday, March 26, 2018

Linux library naming conventions

[root@localhost lib]# ls -lrt |grep libodbc.so
-rwxr-xr-x 1 root root 1804447 May 29 2013 libodbc.so.2.0.0
lrwxrwxrwx 1 root root 16 Nov 6 2014 libodbc.so.2 -> libodbc.so.2.0.0
lrwxrwxrwx 1 root root 16 Nov 6 2014 libodbc.so -> libodbc.so.2.0.0

Real name: libodbc.so.2.0.0

SONAME: libodbc.so.2

Linker name: libodbc.so

gcc -lodbc looks for libodbc.so (either a symlink or a regular file).

The dependency recorded in the resulting binary is the SONAME: libodbc.so.2
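
The relationship between the three names can be sketched as a small string helper. It assumes the library follows the usual lib<name>.so.<major>[.<minor>[.<release>]] convention; the authoritative SONAME is whatever is stored inside the file itself, so treat this as an illustration only. library_names is a hypothetical helper name.

```python
def library_names(real_name):
    """Derive the conventional names from a real name such as 'libodbc.so.2.0.0'."""
    base, sep, version = real_name.partition(".so.")
    if not sep or not version:
        raise ValueError(f"expected something like libfoo.so.X[.Y[.Z]], got {real_name!r}")
    major = version.split(".")[0]          # first version component
    linker_name = base + ".so"             # what -l<name> resolves to
    flag = ("-l" + base[len("lib"):]) if base.startswith("lib") else None
    return {
        "real_name": real_name,
        "soname": f"{linker_name}.{major}",  # by convention only, see note above
        "linker_name": linker_name,
        "gcc_flag": flag,
    }

names = library_names("libodbc.so.2.0.0")
print(names["soname"])       # libodbc.so.2
print(names["linker_name"])  # libodbc.so
print(names["gcc_flag"])     # -lodbc
```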

Print SONAME of a shared library:

[root@localhost lib]# objdump -p libodbc.so |grep 'SONAME' |awk -F ' ' '{print $2}'

libodbc.so.2

[root@localhost lib]# readelf -d libodbc.so |grep soname

0x000000000000000e (SONAME) Library soname: [libodbc.so.2]

Reference:

Every shared library has a special name called the ``soname''. The soname has the prefix ``lib'', the name of the library, the phrase ``.so'', followed by a period and a version number that is incremented whenever the interface changes (as a special exception, the lowest-level C libraries don't start with ``lib''). A fully-qualified soname includes as a prefix the directory it's in; on a working system a fully-qualified soname is simply a symbolic link to the shared library's ``real name''.

Every shared library also has a ``real name'', which is the filename containing the actual library code. The real name adds to the soname a period, a minor number, another period, and the release number. The last period and release number are optional. The minor number and release number support configuration control by letting you know exactly what version(s) of the library are installed. Note that these numbers might not be the same as the numbers used to describe the library in documentation, although that does make things easier.

In addition, there's the name that the compiler uses when requesting a library, (I'll call it the ``linker name''), which is simply the soname without any version number.

http://tldp.org/HOWTO/Program-Library-HOWTO/shared-libraries.html