Formatted input in Python


I have a file that has the following:

      a b c d
    1 2 3 4 5
    2   2 4
    3 1   3 4

Note that the 4 on line 2 is followed by a newline.

I want to make a dictionary that looks like this:

    d['a']['1'] = 2, d['b']['1'] = 3, ..., d['d']['1'] = 5, d['b']['2'] = 2, etc.

The blanks should not appear in the dictionary.

What's the best way to do this in Python?

The data is all single digits, right? And the first line holds the column headers? In that case, you can do this:

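    it = iter(datafile)
    # Header line: column names sit at every second character from index 2
    cols = list(next(it)[2::2])
    d = {}
    for row in it:
        # row[0] is the row label; values line up under the column names
        for col, val in zip(cols, row[2::2]):
            if val != ' ':
                d.setdefault(col, {})[row[0]] = int(val)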
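As a self-contained way to try this, the sketch below wraps the same loop in a function and feeds it the sample from the question through `io.StringIO`. The function name `parse_grid` is made up for illustration, and the sample layout (a header line `  a b c d` followed by three rows, one character per cell) is an assumption about how the file looks.

```python
import io


def parse_grid(lines):
    """Build {column: {row: value}} from a grid of single-character cells."""
    it = iter(lines)
    # Column names sit at every second character of the header, from index 2.
    cols = list(next(it).rstrip("\n")[2::2])
    d = {}
    for row in it:
        row = row.rstrip("\n")
        # zip() stops at the shorter sequence, so short rows are handled.
        for col, val in zip(cols, row[2::2]):
            if val != " ":
                d.setdefault(col, {})[row[0]] = int(val)
    return d


# Assumed layout of the sample file from the question.
sample = io.StringIO(
    "  a b c d\n"
    "1 2 3 4 5\n"
    "2   2 4\n"
    "3 1   3 4\n"
)
result = parse_grid(sample)
print(result["a"])  # {'1': 2, '3': 1}
print(result["b"])  # {'1': 3, '2': 2}
```

Blank cells never reach the dictionary, so `result["a"]` has no entry for row `'2'`.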

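Based on the data and code the author has since added, the code above isn't enough. If the format of the document is 31 rows of rise/set pairs for 12 months, in groups of 6, you can handle it in many ways. Here is what I wrote. It's not elegant, and not as efficient as it could be, but it gets the job done. This is one of the reasons why I would index by row first, then column.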

    import re

    def process(data):
        # Month header line, e.g. "      jan          feb ..."
        hre = re.compile(r' +([a-z]+)' * 6)
        # Sub-header line, e.g. "       rise  set    rise  set ..."
        sre = re.compile(r' +([a-z]+)  ([a-z]+)' * 6)
        # Data row: day number, then six fixed-width rise/set pairs
        dre = re.compile(r'(\d{1,2})  ' + r'(.{4}) (.{4}) {,4}' * 6)

        it = iter(data)
        headers = None
        result = {}

        for line in it:
            if not line.strip():
                continue
            if not headers:
                # find first header
                hmatch = hre.match(line)
                if hmatch:
                    subs = iter(sre.match(next(it)).groups())
                    headers = [h + next(subs)
                               for h in hmatch.groups()
                               for _ in range(2)]
                    count = 0
            else:
                # fill in data; pad the line so a row whose last month is
                # blank (and whose trailing spaces were lost) still matches
                dmatch = dre.match(line.ljust(80))
                row = dmatch.group(1)
                for col, d in zip(headers, dmatch.groups()[1:]):
                    if d.strip():
                        result.setdefault(col, {})[row] = int(d)
                count += 1
                if count == 31:
                    headers = None

        return result
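As a quick check of what the data-row pattern captures, the sketch below runs the same `dre` expression on day 30 of the first table, where February is blank. The line is rebuilt here with the fixed column widths the pattern expects (4-character cells, 4-space gaps), which is an assumption about the original spacing.

```python
import re

# Same data-row pattern as in process(): day, then six rise/set pairs.
dre = re.compile(r'(\d{1,2})  ' + r'(.{4}) (.{4}) {,4}' * 6)

# Day 30, January to June. February has no day 30, so its two cells and
# the gaps around them come out as 17 consecutive spaces.
line = ("30  0539 1920" + " " * 17 +
        "0626 1815    0647 1739    0707 1720    0717 1722")

m = dre.match(line)
day = m.group(1)
cells = m.groups()[1:]

print(day)        # 30
print(cells[:4])  # ('0539', '1920', '    ', '    ')
```

The blank February cells are captured as four spaces each, which is why the parser tests `d.strip()` before calling `int(d)`.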

 

    data = """
    times of sunrise and sunset (for ideal horizon & meteorological conditions)
    year 2012
    make corrections for daylight saving time where necessary.
    ------------------------------------------------------------------------------
          jan          feb          mar          apr          may          jun
           rise  set    rise  set    rise  set    rise  set    rise  set    rise  set
    1  0513 1925    0541 1918    0606 1851    0628 1812    0648 1738    0708 1720
    2  0514 1925    0541 1918    0606 1850    0628 1811    0649 1737    0709 1719
    3  0515 1925    0542 1917    0607 1849    0629 1810    0649 1736    0709 1719
    4  0515 1926    0543 1916    0608 1847    0630 1808    0650 1736    0710 1719
    5  0516 1926    0544 1915    0609 1846    0630 1807    0651 1735    0710 1719
    6  0517 1926    0545 1915    0609 1845    0631 1806    0651 1734    0711 1719
    7  0518 1926    0546 1914    0610 1844    0632 1805    0652 1733    0711 1719

    8  0519 1926    0547 1913    0611 1843    0632 1803    0653 1732    0712 1719
    9  0519 1926    0548 1912    0612 1841    0633 1802    0653 1731    0712 1718
    10  0520 1926    0549 1911    0612 1840    0634 1801    0654 1731    0712 1718
    11  0521 1926    0550 1911    0613 1839    0634 1800    0655 1730    0713 1718
    12  0522 1926    0551 1910    0614 1838    0635 1759    0655 1729    0713 1718
    13  0523 1926    0551 1909    0615 1836    0636 1757    0656 1729    0714 1719
    14  0524 1926    0552 1908    0615 1835    0636 1756    0657 1728    0714 1719

    15  0525 1925    0553 1907    0616 1834    0637 1755    0657 1727    0714 1719
    16  0526 1925    0554 1906    0617 1832    0638 1754    0658 1727    0715 1719
    17  0527 1925    0555 1905    0617 1831    0638 1753    0659 1726    0715 1719
    18  0527 1925    0556 1904    0618 1830    0639 1752    0659 1725    0715 1719
    19  0528 1924    0557 1903    0619 1829    0640 1751    0700 1725    0716 1719
    20  0529 1924    0558 1902    0619 1827    0640 1749    0701 1724    0716 1719
    21  0530 1924    0558 1901    0620 1826    0641 1748    0701 1724    0716 1720

    22  0531 1923    0559 1900    0621 1825    0642 1747    0702 1723    0716 1720
    23  0532 1923    0600 1859    0621 1824    0642 1746    0703 1723    0716 1720
    24  0533 1923    0601 1858    0622 1822    0643 1745    0703 1722    0717 1720
    25  0534 1922    0602 1857    0623 1821    0644 1744    0704 1722    0717 1721
    26  0535 1922    0602 1855    0624 1820    0644 1743    0705 1722    0717 1721
    27  0536 1921    0603 1854    0624 1818    0645 1742    0705 1721    0717 1721
    28  0537 1921    0604 1853    0625 1817    0646 1741    0706 1721    0717 1722

    29  0538 1920    0605 1852    0626 1816    0646 1740    0706 1720    0717 1722
    30  0539 1920                 0626 1815    0647 1739    0707 1720    0717 1722
    31  0540 1919                 0627 1813                 0707 1720
          jul          aug          sep          oct          nov          dec
           rise  set    rise  set    rise  set    rise  set    rise  set    rise  set
    1  0717 1723    0705 1740    0632 1759    0553 1818    0518 1841    0503 1907
    2  0717 1723    0704 1741    0631 1800    0552 1819    0517 1842    0503 1908
    3  0717 1724    0703 1741    0630 1801    0551 1819    0517 1843    0503 1909
    4  0717 1724    0702 1742    0629 1801    0550 1820    0516 1843    0503 1910
    5  0717 1724    0701 1743    0627 1802    0548 1821    0515 1844    0503 1911
    6  0717 1725    0700 1743    0626 1802    0547 1821    0514 1845    0503 1911
    7  0716 1725    0700 1744    0625 1803    0546 1822    0513 1846    0503 1912

    8  0716 1726    0659 1745    0624 1804    0545 1823    0513 1847    0503 1913
    9  0716 1726    0658 1745    0622 1804    0543 1823    0512 1848    0503 1914
    10  0716 1727    0657 1746    0621 1805    0542 1824    0511 1849    0503 1914
    11  0716 1727    0656 1746    0620 1805    0541 1825    0511 1850    0503 1915
    12  0715 1728    0655 1747    0618 1806    0540 1825    0510 1850    0504 1916
    13  0715 1729    0654 1748    0617 1807    0538 1826    0509 1851    0504 1916
    14  0715 1729    0653 1748    0616 1807    0537 1827    0509 1852    0504 1917

    15  0714 1730    0652 1749    0614 1808    0536 1827    0508 1853    0505 1918
    16  0714 1730    0651 1750    0613 1809    0535 1828    0508 1854    0505 1918
    17  0713 1731    0650 1750    0612 1809    0534 1829    0507 1855    0505 1919
    18  0713 1731    0649 1751    0610 1810    0533 1830    0507 1856    0506 1920
    19  0713 1732    0648 1751    0609 1810    0531 1830    0506 1857    0506 1920
    20  0712 1733    0647 1752    0608 1811    0530 1831    0506 1858    0507 1921
    21  0712 1733    0645 1753    0607 1812    0529 1832    0505 1859    0507 1921

    22  0711 1734    0644 1753    0605 1812    0528 1833    0505 1859    0508 1922
    23  0711 1734    0643 1754    0604 1813    0527 1834    0505 1900    0508 1922
    24  0710 1735    0642 1755    0603 1813    0526 1834    0504 1901    0509 1923
    25  0709 1736    0641 1755    0601 1814    0525 1835    0504 1902    0509 1923
    26  0709 1736    0640 1756    0600 1815    0524 1836    0504 1903    0510 1923
    27  0708 1737    0638 1756    0559 1815    0523 1837    0503 1904    0510 1924
    28  0707 1738    0637 1757    0557 1816    0522 1838    0503 1905    0511 1924

    29  0707 1738    0636 1758    0556 1817    0521 1838    0503 1906    0512 1924
    30  0706 1739    0635 1758    0555 1817    0520 1839    0503 1906    0512 1925
    31  0705 1739    0634 1759                 0519 1840                 0513 1925
    """.split('\n')

 

    >>> d = process(data)
    >>> d['decrise']['8']
    503
    >>> d
    {'augset': {'24': 1755, '25': 1755, '26': 1756, '27': 1756, '20': 1752...
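The earlier remark about indexing by row first, then column, can be sketched as a small conversion step. The helper name `by_row_first` is hypothetical, and the input is a hand-made excerpt in the shape `process()` returns, using the December values for days 8 and 9 from the table.

```python
def by_row_first(col_first):
    """Turn {column: {row: value}} into {row: {column: value}}."""
    row_first = {}
    for col, rows in col_first.items():
        for row, val in rows.items():
            row_first.setdefault(row, {})[col] = val
    return row_first


# Hand-made excerpt in the same shape as the result of process().
d = {
    "decrise": {"8": 503, "9": 503},
    "decset": {"8": 1913, "9": 1914},
}

days = by_row_first(d)
print(days["8"])  # {'decrise': 503, 'decset': 1913}
```

With the day as the outer key, everything known about one day sits in a single inner dictionary, which may be more convenient when looking up a particular date.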
