@@ -94,7 +94,7 @@ data into a DataFrame object. They can take a number of arguments:
   - ``converters``: a dictionary of functions for converting values in certain
     columns, where keys are either integers or column labels
   - ``encoding``: a string representing the encoding to use if the contents are
-    non-ascii, for python versions prior to 3
+    non-ascii
   - ``verbose`` : show number of NA values inserted in non-numeric columns
 
 .. ipython:: python
@@ -139,6 +139,67 @@ fragile. Type inference is a pretty big deal. So if a column can be coerced to
 integer dtype without altering the contents, it will do so. Any non-numeric
 columns will come through as object dtype as with the rest of pandas objects.
 
+.. _io.fwf:
+
+Files with Fixed Width Columns
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+While `read_csv` reads delimited data, the :func:`~pandas.io.parsers.read_fwf`
+function works with data files that have known and fixed column widths.
+The function parameters to `read_fwf` are largely the same as `read_csv` with
+two extra parameters:
+
+  - ``colspecs``: a list of pairs (tuples), giving the extents of the
+    fixed-width fields of each line as half-open intervals [from, to)
+  - ``widths``: a list of field widths, which can be used instead of
+    ``colspecs`` if the intervals are contiguous
+
+.. ipython:: python
+   :suppress:
+
+   f = open('bar.csv', 'w')
+   data1 = ("id8141    360.242940   149.910199   11950.7\n"
+            "id1594    444.953632   166.985655   11788.4\n"
+            "id1849    364.136849   183.628767   11806.2\n"
+            "id1230    413.836124   184.375703   11916.8\n"
+            "id1948    502.953953   173.237159   12468.3")
+   f.write(data1)
+   f.close()
+
+Consider a typical fixed-width data file:
+
+.. ipython:: python
+
+   print(open('bar.csv').read())
+
+In order to parse this file into a DataFrame, we simply need to supply the
+column specifications to the `read_fwf` function along with the file name:
+
+.. ipython:: python
+
+   # Column specifications are a list of half-open intervals
+   colspecs = [(0, 6), (8, 20), (21, 33), (34, 43)]
+   df = read_fwf('bar.csv', colspecs=colspecs, header=None, index_col=0)
+   df
+
+Note how the parser automatically picks column names X.<column number> when
+the ``header=None`` argument is specified. Alternatively, you can supply just the
+column widths for contiguous columns:
+
+.. ipython:: python
+
+   # Widths are a list of integers
+   widths = [6, 14, 13, 10]
+   df = read_fwf('bar.csv', widths=widths, header=None)
+   df
+
+The parser takes care of extra whitespace around the columns,
+so it's OK to have extra separation between the columns in the file.
+
+.. ipython:: python
+   :suppress:
+
+   os.remove('bar.csv')
+
 Files with an "implicit" index column
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
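The relationship between ``widths`` and ``colspecs`` in the fixed-width hunk can be checked with a small standalone sketch. It assumes a recent pandas imported as ``pd`` (the documentation itself calls a bare ``read_fwf``), reads from an in-memory buffer instead of ``bar.csv``, and uses a hypothetical helper ``widths_to_colspecs`` that is not part of pandas.

```python
import io

import pandas as pd

# Three rows of the sample data from the documentation, held in memory.
data = (
    "id8141    360.242940   149.910199   11950.7\n"
    "id1594    444.953632   166.985655   11788.4\n"
    "id1849    364.136849   183.628767   11806.2\n"
)


def widths_to_colspecs(widths):
    """Hypothetical helper: turn contiguous widths into half-open (start, stop) pairs."""
    colspecs = []
    start = 0
    for width in widths:
        colspecs.append((start, start + width))
        start += width
    return colspecs


widths = [6, 14, 13, 10]
colspecs = widths_to_colspecs(widths)

# Both calls should describe the same contiguous fields.
by_widths = pd.read_fwf(io.StringIO(data), widths=widths, header=None)
by_colspecs = pd.read_fwf(io.StringIO(data), colspecs=colspecs, header=None)

print(colspecs)
print(by_widths.equals(by_colspecs))
```

The padding that falls inside each field is stripped before type inference, so the wider contiguous intervals here yield the same values as the tighter ``colspecs`` used in the documentation example.
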
@@ -281,7 +342,7 @@ function takes a number of arguments. Only the first is required.
   - ``mode`` : Python write mode, default 'w'
   - ``sep`` : Field delimiter for the output file (default ",")
   - ``encoding``: a string representing the encoding to use if the contents are
-    non-ascii, for python versions prior to 3  
+    non-ascii, for python versions prior to 3
 
 Writing a formatted string
 ~~~~~~~~~~~~~~~~~~~~~~~~~~
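
As a quick illustration of the ``sep`` argument listed in the last hunk, the following sketch writes a small frame to an in-memory buffer. It assumes a recent pandas imported as ``pd``; the frame contents and the ``;`` delimiter are made up for the example.

```python
import io

import pandas as pd

# Illustrative data only; not taken from the documentation.
df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

# Write with an explicit field delimiter to an in-memory buffer.
buf = io.StringIO()
df.to_csv(buf, sep=";", index=False)
custom_lines = buf.getvalue().splitlines()

# With no target path, to_csv returns the text; sep defaults to a comma.
default_lines = df.to_csv(index=False).splitlines()

print(custom_lines)
print(default_lines)
```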