Saturday, March 31, 2018


Break training data into slices for stochastic gradient descent:

import numpy as np
n = 100
training_data = list(range(n))
mini_batch_size = 10
np.random.shuffle(training_data)
mini_batches = [training_data[k:k+mini_batch_size]
    for k in range(0, n, mini_batch_size)]
mini_batches

[[90, 5, 70, 82, 58, 2, 16, 85, 12, 35],
 [14, 54, 62, 39, 96, 73, 60, 80, 33, 89],
 [20, 38, 76, 47, 65, 42, 71, 46, 93, 34],
 [52, 64, 13, 92, 17, 49, 88, 63, 74, 23],
 [43, 25, 10, 97, 48, 68, 95, 81, 24, 31],
 [9, 32, 84, 83, 22, 87, 61, 26, 28, 99],
 [0, 67, 30, 69, 72, 45, 79, 51, 40, 55],
 [6, 15, 75, 66, 29, 3, 18, 77, 98, 21],
 [53, 44, 50, 19, 91, 8, 11, 59, 27, 56],
 [36, 94, 7, 57, 1, 37, 86, 78, 41, 4]]
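The same idea can be wrapped in a generator, so the data is reshuffled each time you start a new pass. This is only a minimal sketch: the name `iter_mini_batches` and the `seed` parameter are made up for illustration, and it assumes a NumPy version that provides `np.random.default_rng`.

```python
import numpy as np

def iter_mini_batches(data, mini_batch_size, seed=None):
    """Yield shuffled mini-batches; the last one may be shorter."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(data))  # shuffled indices, data itself is untouched
    for k in range(0, len(data), mini_batch_size):
        yield [data[i] for i in order[k:k + mini_batch_size]]

batches = list(iter_mini_batches(list(range(25)), 10, seed=0))
print([len(b) for b in batches])  # [10, 10, 5]
```

Unlike `np.random.shuffle`, this leaves the original list in its original order, which can be handy if you need it again later.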

Thursday, March 29, 2018

Indices in Python lists

Python list indices and slices may feel uncomfortable at first, but they are really convenient once you understand them. You may even come to love their simplicity:


>>> a = list(range(10))
>>> a
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> a[::2]
[0, 2, 4, 6, 8]
>>> a[1::2]
[1, 3, 5, 7, 9]
>>> a[::-2]
[9, 7, 5, 3, 1]
>>> a[1::-2]
[1]
>>> a[1:8]
[1, 2, 3, 4, 5, 6, 7]
>>> a[1:-2]
[1, 2, 3, 4, 5, 6, 7]
>>> a[::-1]
[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
>>> a[100]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range
>>> a[:100]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> a[7:100]
[7, 8, 9]
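
Why does `a[100]` raise an error while `a[:100]` does not? A slice is clamped to the length of the list before it is applied. One way to see this is `slice.indices`, which returns the effective `(start, stop, step)` for a given length. A small sketch:

```python
a = list(range(10))

# slice.indices(length) shows the bounds Python actually uses
print(slice(None, 100).indices(len(a)))    # (0, 10, 1)
print(slice(7, 100).indices(len(a)))       # (7, 10, 1)
print(slice(1, -2).indices(len(a)))        # (1, 8, 1)
print(slice(1, None, -2).indices(len(a)))  # (1, -1, -2)

# Feeding those bounds to range() reproduces the slice
start, stop, step = slice(1, None, -2).indices(len(a))
print([a[i] for i in range(start, stop, step)])  # [1]
```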

Reference:
https://docs.python.org/3/tutorial/introduction.html#strings


Monday, March 26, 2018

Linux library naming conventions

[root@localhost lib]# ls -lrt |grep libodbc.so
-rwxr-xr-x 1 root root 1804447 May 29  2013 libodbc.so.2.0.0
lrwxrwxrwx 1 root root      16 Nov  6  2014 libodbc.so.2 -> libodbc.so.2.0.0
lrwxrwxrwx 1 root root      16 Nov  6  2014 libodbc.so -> libodbc.so.2.0.0

Real name:  libodbc.so.2.0.0
SONAME: libodbc.so.2 
Linker name: libodbc.so

With "-lodbc", gcc looks for libodbc.so (a symlink or a regular file).
The dependency recorded in the resulting binary is the SONAME: libodbc.so.2


Print SONAME of a shared library:
[root@localhost lib]# objdump -p libodbc.so |grep  'SONAME' |awk -F ' '  '{print $2}'
libodbc.so.2

[root@localhost lib]# readelf -d libodbc.so |grep soname
 0x000000000000000e (SONAME)             Library soname: [libodbc.so.2]

Reference:
Every shared library has a special name called the ``soname''. The soname has the prefix ``lib'', the name of the library, the phrase ``.so'', followed by a period and a version number that is incremented whenever the interface changes (as a special exception, the lowest-level C libraries don't start with ``lib''). A fully-qualified soname includes as a prefix the directory it's in; on a working system a fully-qualified soname is simply a symbolic link to the shared library's ``real name''.
Every shared library also has a ``real name'', which is the filename containing the actual library code. The real name adds to the soname a period, a minor number, another period, and the release number. The last period and release number are optional. The minor number and release number support configuration control by letting you know exactly what version(s) of the library are installed. Note that these numbers might not be the same as the numbers used to describe the library in documentation, although that does make things easier.
In addition, there's the name that the compiler uses when requesting a library, (I'll call it the ``linker name''), which is simply the soname without any version number.

More: 
http://tldp.org/HOWTO/Program-Library-HOWTO/shared-libraries.html
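
To make the convention concrete, here is a small sketch that derives the linker name and the soname from a real name. The function name `library_names` is made up for illustration, and it only follows the naming convention described above. The authoritative SONAME is the one recorded in the file, as printed by objdump or readelf.

```python
def library_names(real_name):
    """Split a real name like 'libodbc.so.2.0.0' into (linker name, soname)."""
    stem, sep, version = real_name.partition(".so.")
    if not sep:
        raise ValueError(f"no version suffix in {real_name!r}")
    linker_name = stem + ".so"
    # The soname keeps only the first version number
    soname = linker_name + "." + version.split(".")[0]
    return linker_name, soname

print(library_names("libodbc.so.2.0.0"))  # ('libodbc.so', 'libodbc.so.2')
```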


makefile: execute a command and grep then awk

CP=cp
LIB_UNIXODBC=/usr/src/tpkgs/unixodbc/2.3.1/linux86w/lib/libodbc.so
RELEASE_DESTDIR=/bld/release/nsr/fb_mssql_linux/linux86w/source


$(CP) $(LIB_UNIXODBC) $(RELEASE_DESTDIR)/ddbda/odbc/$(shell objdump -p $(LIB_UNIXODBC) |grep 'SONAME' |awk -F ' ' '{print $$2}')


which is equivalent to this command line:
cp /usr/src/tpkgs/unixodbc/2.3.1/linux86w/lib/libodbc.so /bld/release/nsr/fb_mssql_linux/linux86w/source/ddbda/odbc/libodbc.so.2


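Notes:
1. The $(shell ...) function runs a command from within a makefile and substitutes its output.
2. Unlike what you might type on the bash command line, the grep pattern is wrapped in single quotes.
3. The '$' in the awk statement is doubled ('$$2') so that make passes a literal '$2' through to the shell.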
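
If you would rather do this step outside of make, the same logic can be sketched in Python. This is only an illustration: the function names `parse_soname` and `copy_as_soname` are made up, `objdump` is assumed to be on the PATH, and the `sample` text is a hand-written excerpt in the shape of `objdump -p` output, not a real capture.

```python
import shutil
import subprocess
from pathlib import Path

def parse_soname(objdump_output):
    """Return the SONAME from `objdump -p` output, or None if absent."""
    for line in objdump_output.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "SONAME":
            return fields[1]
    return None

def copy_as_soname(lib_path, dest_dir):
    """Copy lib_path into dest_dir under its SONAME (needs objdump on PATH)."""
    out = subprocess.run(["objdump", "-p", str(lib_path)],
                         check=True, capture_output=True, text=True).stdout
    soname = parse_soname(out)
    if soname is None:
        raise ValueError(f"{lib_path} has no SONAME")
    return shutil.copy(lib_path, Path(dest_dir) / soname)

# Hand-written sample, only to exercise the parser
sample = ("Dynamic Section:\n"
          "  NEEDED               libc.so.6\n"
          "  SONAME               libodbc.so.2\n")
print(parse_soname(sample))  # libodbc.so.2
```

`shutil.copy` follows symlinks, so passing the libodbc.so link copies the real file contents, just like plain `cp` does.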

Monday, March 5, 2018

pandas read_csv from https with Python 3.6.4

On macOS, if you use Python 3.6 and pandas to read a CSV file over HTTPS:
california_housing_dataframe = pd.read_csv("https://storage.googleapis.com/mledu-datasets/california_housing_train.csv", sep=",")
california_housing_dataframe.describe()
you may get an error like:
urllib.error.URLError:

To fix this issue:
Open a terminal and take a look at:
/Applications/Python 3.6/Install Certificates.command
Python 3.6 on macOS uses an embedded version of OpenSSL, which does not use the system certificate store. More details here.
(To be explicit: macOS users can probably resolve this by opening Finder and double-clicking Install Certificates.command.)
Or read the CSV over HTTPS with a workaround:
from io import StringIO

import pandas as pd
import requests


url = "https://storage.googleapis.com/mledu-datasets/california_housing_train.csv"
s = requests.get(url).text
c = pd.read_csv(StringIO(s))
print(c.head())
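
To check whether the certificate problem described above applies to your interpreter, you can ask Python where its OpenSSL looks for CA certificates. This is a small diagnostic sketch using only the standard `ssl` module. If `cafile` and `capath` both come back as `None`, it is likely that no certificate bundle is installed for this Python.

```python
import ssl

paths = ssl.get_default_verify_paths()
print("cafile: ", paths.cafile)          # None if no CA bundle file was found
print("capath: ", paths.capath)          # None if no CA directory was found
print("default:", paths.openssl_cafile)  # location compiled into OpenSSL
```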


Sunday, March 4, 2018

numpy.array vs numpy.asarray

Looking at the definition, you'll see the difference between them:
def asarray(a, dtype=None, order=None):
    return array(a, dtype, copy=False, order=order)
The main difference is that array (by default) will make a copy of the object, while asarray will not unless necessary.

The difference can be demonstrated by this example:
  1. generate a matrix
    >>> import numpy
    >>> A = numpy.matrix(numpy.ones((3,3)))
    >>> A
    matrix([[ 1.,  1.,  1.],
            [ 1.,  1.,  1.],
            [ 1.,  1.,  1.]])
  2. use numpy.array to modify A. This doesn't work because you are modifying a copy
    >>> numpy.array(A)[2]=2
    >>> A
    matrix([[ 1.,  1.,  1.],
            [ 1.,  1.,  1.],
            [ 1.,  1.,  1.]])
  3. use numpy.asarray to modify A. This works because the result shares A's data, so you are modifying A itself
    >>> numpy.asarray(A)[2]=2
    >>> A
    matrix([[ 1.,  1.,  1.],
            [ 1.,  1.,  1.],
            [ 2.,  2.,  2.]])
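
The same behaviour can be checked on a plain ndarray without modifying anything. This is a short sketch using `np.shares_memory`. Note that asarray still has to copy when a conversion is needed, for example when a different dtype is requested.

```python
import numpy as np

x = np.ones((3, 3))

copied = np.array(x)   # a new array by default
same = np.asarray(x)   # x is already an ndarray: returned as-is

print(copied is x, np.shares_memory(copied, x))  # False False
print(same is x, np.shares_memory(same, x))      # True True

# asarray copies when the dtype has to change
converted = np.asarray(x, dtype=np.int32)
print(np.shares_memory(converted, x))            # False
```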

Saturday, March 3, 2018

Install Emacs on Mac OSX with brew

$ brew install --cask emacs
(Older Homebrew releases used: brew cask install emacs)
Reference: https://www.gnu.org/software/emacs/download.html#macos

If you run into this error while accessing the Desktop/Documents/Downloads directories:


Here is the fix: