{ "metadata": { "name": "" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "More Python - Files, Functions and Modules\n", "===============\n", "\n", "by\n", "\n", "Kaustubh Vaghmare\n", "-------------\n", "(IUCAA, Pune)\n", "\n", "E-mail: kaustubh[at]iucaa[dot]ernet[dot]in" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Handling Files\n", "---------------\n", "\n", "Let us study how to handle files through a simple exercise. The basic approach involves creating file objects in Python and use various methods associated with file objects to handle file I/O.\n", "\n", "* open() function is used to create file object.\n", "* fileObject.read() - reads entire file as one big string.\n", "* fileObject.write() - to write a string in a file.\n", "* fileObject.readlines() - to read each line as an element of a list.\n", "* fileObject.writelines() - to write a set of lines, each one being a string.\n", "* fileObject.close() - to close a file (buffer flush)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Program to \"Double Space\" a File\n", "-----------------" ] }, { "cell_type": "code", "collapsed": false, "input": [ "\"\"\" \n", "Program to create a double spaced file.\n", "Input: File Name\n", "Output: Modified File with .sp extension\n", "\"\"\"\n", "\n", "import sys # we need this to parse command line arguments.\n", "import os # we need this to check for file's existence\n" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": true, "input": [ "# Check number of arguments.\n", "if len(sys.argv) == 2:\n", "\tinfile_name = sys.argv[1]\n", "else:\n", "\tprint \"Oops! 
Incorrect Number of Arguments.\"\n", "\tsys.exit(2)\n", "\n", "# Check if file exists.\n", "if not os.path.isfile(infile_name):\n", "\tprint \"File doesn't exist.\"\n", "\tsys.exit(3)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [] }, { "cell_type": "code", "collapsed": true, "input": [ "# Open the input file.\n", "infile = open(infile_name, \"r\")\n", "\n", "# Open an output file.\n", "outfile = open(infile_name + \".sp\", \"w\")\n", "\n", "# Loop over each line, add new line to each line.\n", "for line in infile.readlines():\n", "\tline = line+\"\\n\"\n", "\toutfile.write(line)\n", "\n", "outfile.close()\n", "infile.close()\n" ], "language": "python", "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Functions\n", "==========\n", "\n", "Blocks of code that perform a specific task. \n", "\n", "In Python, a function is defined using the \"def\" keyword. \n", "\n", "We have already seen examples of functions.\n", "\n", "* float(), dict(), list(), len() etc.\n", "* math - sqrt(), floor(), ceil(), radians(), sin()\n", "* open(), type() etc." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "A Simple Function\n", "-------------------" ] }, { "cell_type": "code", "collapsed": false, "input": [ "def myfun():\n", " print \"Hello World!\"\n", " print \"Nice to see you.\"\n", "\n", "print \"Outside the function.\"" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Outside the function.\n" ] } ], "prompt_number": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Pay attention to how the statements indented one level up are part of the function while the statement indented at the same level is not a part of the function." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "myfun() # This is how you call our function." ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Hello World!\n", "Nice to see you.\n" ] } ], "prompt_number": 2 }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Function With One Argument\n", "-------------\n" ] }, { "cell_type": "code", "collapsed": false, "input": [ "def myfun(a):\n", " print \"Inside MyFun!\"\n", " print a" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 3 }, { "cell_type": "code", "collapsed": false, "input": [ "myfun() # WILL GIVE ERROR." ], "language": "python", "metadata": {}, "outputs": [ { "ename": "TypeError", "evalue": "myfun() takes exactly 1 argument (0 given)", "output_type": "pyerr", "traceback": [ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[1;31mTypeError\u001b[0m Traceback (most recent call last)", "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[0mmyfun\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;31m# WILL GIVE ERROR.\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[1;31mTypeError\u001b[0m: myfun() takes exactly 1 argument (0 given)" ] } ], "prompt_number": 4 }, { "cell_type": "markdown", "metadata": {}, "source": [ "As per function definition, one argument / input is needed. An attempt to call the function with none gives an error. EVEN supplying two arguments is wrong." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "myfun(\"An Input\")" ], "language": "python", "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Inside MyFun!\n", "An Input\n" ] } ], "prompt_number": 5 }, { "cell_type": "markdown", "metadata": {}, "source": [ "REMEMBER\n", "------\n", "\n", "Python is a dynamically typed language. The true strength of this lies in the fact that you can also call the above function with a float or integer or list input!" ] }, { "cell_type": "code", "collapsed": false, "input": [ "myfun(5)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Inside MyFun!\n", "5\n" ] } ], "prompt_number": 6 }, { "cell_type": "code", "collapsed": false, "input": [ "myfun( [1,2,3] )" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Inside MyFun!\n", "[1, 2, 3]\n" ] } ], "prompt_number": 7 }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Functions that \"return\" something.\n", "----------" ] }, { "cell_type": "code", "collapsed": false, "input": [ "def add(a,b):\n", " return a+b" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "prompt_number": 8 }, { "cell_type": "code", "collapsed": false, "input": [ "a = add(2,3)\n", "print a" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "5\n" ] } ], "prompt_number": 9 }, { "cell_type": "markdown", "metadata": {}, "source": [ "A function that does not have a return statement returns by default something called \"None\"." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "b = myfun(\"Hello\")" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Inside MyFun!\n", "Hello\n" ] } ], "prompt_number": 10 }, { "cell_type": "code", "collapsed": false, "input": [ "print b" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "None\n" ] } ], "prompt_number": 11 }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Functions can return more than one value at a time!\n", "------" ] }, { "cell_type": "code", "collapsed": false, "input": [ "def sumprod(a,b):\n", "    return a+b, a*b" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 12 }, { "cell_type": "code", "collapsed": false, "input": [ "s, p = sumprod(2,3)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 13 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Well, technically, Python is returning only one object, but that one object is a tuple. In the above case it is (5, 6), which is then unpacked into s and p." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Optional Arguments\n", "---------------\n", "\n", "\"I want a function to assume some values for some arguments when I don't provide them!\" Let's see how this is achieved."
] }, { "cell_type": "code", "collapsed": false, "input": [ "def myfun(message = \"Default Message\"):\n", "    print message" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 14 }, { "cell_type": "code", "collapsed": false, "input": [ "myfun(\"Hello World\")" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Hello World\n" ] } ], "prompt_number": 15 }, { "cell_type": "code", "collapsed": false, "input": [ "myfun()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Default Message\n" ] } ], "prompt_number": 16 }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Functions with an Arbitrary Number of Arguments\n", "-------------" ] }, { "cell_type": "code", "collapsed": false, "input": [ "def sumitall(*values):\n", "    total = 0\n", "    for i in values:\n", "        total += i\n", "    return total" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 17 }, { "cell_type": "code", "collapsed": false, "input": [ "sumitall(2,3,4,5)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 18, "text": [ "14" ] } ], "prompt_number": 18 }, { "cell_type": "code", "collapsed": false, "input": [ "sumitall(2,3,4)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 19, "text": [ "9" ] } ], "prompt_number": 19 }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Mixture of Arguments\n", "---------" ] }, { "cell_type": "code", "collapsed": false, "input": [ "sumitall()" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 20, "text": [ "0" ] } ], "prompt_number": 20 }, { "cell_type": "code", "collapsed": false, "input": [ "def sumitall2(val1, *values):\n", "    total = val1\n", "    for i in values:\n", "        total += i\n", "    return total" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 21 }, { "cell_type": "code", "collapsed": false, "input": [ "sumitall2(2)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 22, "text": [ "2" ] } ], "prompt_number": 22 }, { "cell_type": "code", "collapsed": false, "input": [ "sumitall2(2,3,4)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 23, "text": [ "9" ] } ], "prompt_number": 23 }, { "cell_type": "code", "collapsed": false, "input": [ "sumitall2() # WILL GIVE AN ERROR." ], "language": "python", "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "ename": "TypeError", "evalue": "sumitall2() takes at least 1 argument (0 given)", "output_type": "pyerr", "traceback": [ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[1;31mTypeError\u001b[0m Traceback (most recent call last)", "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[0msumitall2\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;31m# WILL GIVE AN ERROR.\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[1;31mTypeError\u001b[0m: sumitall2() takes at least 1 argument (0 given)" ] } ], "prompt_number": 24 }, { "cell_type": "markdown", "metadata": {}, "source": [ "This way, you can design functions that impose a minimum number of arguments while still accepting an arbitrary number of additional ones!" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Functions are Objects\n", "---------------\n", "\n", "Like lists, dictionaries, ints, floats, strings etc., functions can be passed to other functions, since they are just objects."
] }, { "cell_type": "code", "collapsed": false, "input": [ "def myfun(message):\n", " print message" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 25 }, { "cell_type": "code", "collapsed": false, "input": [ "def do(f, arg):\n", " f(arg)\n", " \n", "do(myfun, \"Something\")" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Something\n" ] } ], "prompt_number": 26 }, { "cell_type": "code", "collapsed": false, "input": [ "x = myfun # simple variable assignment\n", "x(\"Hilo!\")" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Hilo!\n" ] } ], "prompt_number": 27 }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Function Documentation\n", "--------------\n", "\n", "Recall using help(math.hypot) to get help on understanding how to use hypot() function. Can we design a function myfun() and ensure that help(myfun) also gives a nice \"help\" output?" ] }, { "cell_type": "code", "collapsed": false, "input": [ "def myfun(a,b):\n", " \"\"\"\n", " Input: Two Objects\n", " Output: Sum of the two input objects.\n", " \"\"\"\n", " return a+b" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 28 }, { "cell_type": "code", "collapsed": false, "input": [ "help(myfun)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Help on function myfun in module __main__:\n", "\n", "myfun(a, b)\n", " Input: Two Objects\n", " Output: Sum of the two input objects.\n", "\n" ] } ], "prompt_number": 29 }, { "cell_type": "markdown", "metadata": {}, "source": [ "When designing functions of your own, it is always good to document what the function does so that you and others can use it in the future with ease." 
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Modules\n", "=======\n", "\n", "Modules can be considered as \"namespaces\" which hold a collection of objects that you can use when needed. For example, the math module has 42 objects, including the two numbers \"e\" and \"pi\" and 40 functions.\n", "\n", "Every program you execute directly is treated as a module with a special name \\_\\_main\\_\\_.\n", "\n", "So, all the variables you define and the functions you create are said to live in the namespace of \\_\\_main\\_\\_.\n", "\n", "When you say the following, you are making the namespace of __math__ available to you.\n", "\n", "    import math\n", "    \n", "To then access something inside __math__, you say\n", "\n", "    math.object" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "So what happens when you \"import\"\n", "---------------\n", "\n", "* The Python interpreter searches for the module (e.g. a file such as math.py) in the current directory and then in the installation directories (in that order), and compiles it if it is not already compiled.\n", "* Next, it creates a handle of the same name, i.e. \"math\", which can be used to access the objects living inside math." ] }, { "cell_type": "code", "collapsed": false, "input": [ "import math\n", "type(math)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 30, "text": [ "module" ] } ], "prompt_number": 30 }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Another Way to \"import\"\n", "---------------------\n", "\n", "In the above example, you are accessing objects inside __math__ through the module object that Python created. 
It is also possible to make these objects become a part of the current namespace.\n", "\n", "    from math import *" ] }, { "cell_type": "code", "collapsed": false, "input": [ "from math import *\n", "radians(45) # no math.radians required." ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 31, "text": [ "0.7853981633974483" ] } ], "prompt_number": 31 }, { "cell_type": "markdown", "metadata": {}, "source": [ "__WARNING: The above method is extremely dangerous! If your program and the module define objects with the same name, one will silently replace the other, and the above statement will cause a lot of mix-up!__" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "A Middle Ground\n", "-------------\n", "\n", "If there is a specific object you use frequently and would like to make part of your main namespace, use\n", "\n", "    from ModuleName import Object" ] }, { "cell_type": "code", "collapsed": false, "input": [ "from math import sin\n", "print sin(1.54)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "0.999525830605\n" ] } ], "prompt_number": 32 }, { "cell_type": "markdown", "metadata": {}, "source": [ "__NOTE:__ If you import the same module again in the same program, Python does not reload it. Use reload(ModuleName) for reloading." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Aliases for a Module\n", "----------\n", "\n", "If you have decided to access a module's objects from its own namespace, you can choose to give the module an alias. 
\n", "\n", " import numpy as np\n", " np.array(...)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another example, \n", " \n", " import matplotlib.pyplot as plt\n", " plt.plot(x,y)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "The Python Module Ecosystem\n", "==============\n", "\n", "There are three types of modules you will encounter in Python.\n", "\n", "* Built-in Modules (come with any standard installation of Python)\n", "* Third Party Modules (need to be installed separately)\n", "* Your Own Modules (we'll see how to make them soon)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Built-in Modules\n", "----------------\n", "\n", "* sys - contains tools for system arguments, OS information etc.\n", "* os - for handling files, directories, executing external programs\n", "* re - for parsing regular expressions\n", "* datetime - for date and time conversions etc.\n", "* pickle - for object serialization\n", "* csv - for reading CSV tables\n", "\n", "and many many more ... " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Third Party Modules\n", "------\n", "\n", "These need to be installed separately.\n", "\n", "* numpy / scipy - numerical plus scientific computing extensions to Python\n", "* matplotlib - using Python for plots\n", "* mayavi - for animations in 3D\n", "* pandas - for tabular data analysis\n", "* astropy - Python for Astronomers\n", "* scikit-learn - machine learning and classification tools for Python" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Making your Own Modules\n", "------------\n", "\n", "Very simple. 
Open a file, say, \"MyModule.py\"\n", "\n", "Write code in the file.\n", "\n", "If the file is in the present folder or on the PYTHONPATH, the following will work.\n", "\n", "    import MyModule\n", "    MyModule.something ...\n", "\n", "* __NOTE 1:__ The file name must have the extension .py\n", "* __NOTE 2:__ When importing, the extension must be dropped." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Example Module - Example.py\n", "--------\n", "\n", "    \"\"\"\n", "    This is a custom module.\n", "    Containing some functions for the purpose of demonstration.\n", "    \"\"\"\n", "    def fun1():\n", "        print \"Inside fun1\"\n", "    \n", "    def fun2():\n", "        print \"Inside fun2\"\n", "    \n", "    pi = 3.14\n", "    e = 2.7\n", "    \n", "    print \"I am a Custom Module\"\n", "\n", "The above code is stored in Example.py. Let's see how to use it.\n", " " ] }, { "cell_type": "code", "collapsed": false, "input": [ "import Example" ], "language": "python", "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "I am a Custom Module\n" ] } ], "prompt_number": 33 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice the message printed by Example.py. This illustrates that importing a module executes its top-level code, so any output generated by Example.py appears on the screen."
] }, { "cell_type": "code", "collapsed": false, "input": [ "print Example.pi" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "3.14\n" ] } ], "prompt_number": 34 }, { "cell_type": "code", "collapsed": false, "input": [ "Example.fun1()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Inside fun1\n" ] } ], "prompt_number": 35 }, { "cell_type": "code", "collapsed": false, "input": [ "help(Example)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Help on module Example:\n", "\n", "NAME\n", " Example\n", "\n", "FILE\n", " /home/tutorial/NCRA2014/Example.py\n", "\n", "DESCRIPTION\n", " This is a custom module.\n", " Containing some functions for the purpose of demonstration.\n", "\n", "FUNCTIONS\n", " fun1()\n", " \n", " fun2()\n", "\n", "DATA\n", " e = 2.7\n", " pi = 3.14\n", "\n", "\n" ] } ], "prompt_number": 36 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice the description. It is what you enclosed in the \"docstring\" at the beginning of the module." ] } ], "metadata": {} } ] }