Home > python > Any way to optimize or improve this python combinations/subsets functions?

Any way to optimize or improve this python combinations/subsets functions?

March 2Hits:0
Advertisement

I believe this is essentially the same method as the itertools.combinations function, but is there any way to make this code more more perfect in terms of speed, code size and readability :

def all_subsets(source,size):         index = len(source)         index_sets = [()]         for sz in xrange(size):                 next_list = []                 for s in index_sets:                         si = s[len(s)-1] if len(s) > 0 else -1                         next_list += [s+(i,) for i in xrange(si+1,index)]                 index_sets = next_list         subsets = []         for index_set in index_sets:                 rev = [source[i] for i in index_set]                 subsets.append(rev)         return subsets 

Yields:

>>> Apriori.all_subsets(['c','r','i','s'],2) [['c', 'r'], ['c', 'i'], ['c', 's'], ['r', 'i'], ['r', 's'], ['i', 's']] 

There is probably a way to use generators or functional concepts, hopefully someone can suggest improvements.

Answers

This type of problems lend themselves very well to recursion. A possible implementation, either in list or generator form could be:

def all_subsets(di, i) :
    ret = []
    for j, item in enumerate(di) :
        if i == 1 :
            ret = [(j,) for j in di]
        elif len(di) - j >= i :
            for subset in all_subsets(di[j + 1:], i - 1) :
                ret.append((item,) + subset)
    return ret

def all_subsets_gen(di, i) :
    for j, item in enumerate(di) :
        if i == 1 :
            yield (j,)
        elif len(di) - j >= i :
            for subset in all_subsets(di[j + 1:], i - 1) :
                yield (item,) + subset

>>> all_subsets(range(4), 3)
[(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
>>> list(all_subsets_gen(range(4), 3))
[(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]

If you are going to run this on large sets, you may want to memoize intermediate results to speed things up.

Related Articles

  • Any way to optimize or improve this python combinations/subsets functions?March 2

    I believe this is essentially the same method as the itertools.combinations function, but is there any way to make this code more more perfect in terms of speed, code size and readability : def all_subsets(source,size): index = len(source) index_sets

  • Improve performance of Gaussian kernel function evaluationFebruary 26

    I need to improve the performance of a function that calculates the integral of a two-dimensional kernel density estimate (obtained using the function stats.gaussian_kde) where the domain of integration is all points that evaluate below a given value

  • How to turn this working A/B optimization program into more pythonic code?November 2

    I have a working solution for the given problem of A/B Optimization, but you as an experienced Python (3) developer will agree with me, that this solution is not very pythonic at all. My Question here is: How do I make this code more pythonic? It's a

  • Areas where I can improve my python code? February 8

    I wrote a program to solve a problem, I am not so good in python so want help to find areas where this program can be improved or simplified and readable too:- Problem:- Professor Hatim is researching the sexual behavior of a rare species of lizard.

  • Improve on python public interface data parserFebruary 11

    I would love some thoughts on this class that a wrote to extract user information from a logfile line. I'm concerned about design patterns and the complexity of the code. The extract_user method feels especially awkward. The class should be used as a

  • Improvements on Python game?January 5

    This is a game I made in Python. While it is not my first, I am not happy with the result. I would like suggestions on ways I can make it better, more user-friendly, and more enjoyable. Also, if you have suggestions on how to get the same effect, but

  • Any way to optimize this already fast Python solution to Project Euler 31 (coin sums)?April 20

    This is a Python solution for all general cases. Any suggestions at all would be appreciated; in my opinion it's already quite good! Here's a link to the problem. from timeit import default_timer as timer def count_coins(target, operands): ways_to_ma

  • Merge sort optimization and improvementJanuary 17

    How to optimize this merge sort code to make it run faster? And how to call merge_sort function without user input by declaring necessary array in the code? #include <iostream> using namespace std; int a[50]; void merge(int,int,int); void merge_sort

  • Optimization of A* in Python February 10

    I have implemented A* in my game. However, I am not happy with the performance and need some help deciding what to do next. I have already done the following optimizations: Not having a closed list (a Node just stores a bool onClosedList()) Not itera

  • Optimize (or improve, at least) PSTricks code for drawing a beehive

    Optimize (or improve, at least) PSTricks code for drawing a beehiveApril 20

    Consider the following example. Code % pdflatex -shell-escape test.tex \documentclass{article} \usepackage{ auto-pst-pdf, pst-eucl } \usepackage{siunitx} \begin{document} \begin{figure} \def\indryk{5} \def\bredde{60} \pstFPdiv\konstA{2}{3} \pstFPmul\

  • Handling signals in Python inside a functionJune 16

    I have code that needs to exit gracefully. To simplify things, I am presenting an example based on the answer given here. Since I need to handle a few signals, I thought of putting this logic in a function: def set_signals(): original_sigint = signal

  • Improve performance by using drush functions instead of Drupal module functionsJune 19

    I am trying to improve the import performance of a drupal website we are working on. The site makes use of the queue api to import individual items. Currently this process is very slow. I have hear it mentioned that one way of improving performance i

  • Speeding up Python date conversion function currently using list comprehension and datetimeJanuary 15

    I am reading a large data file where the time is given in number of days since some epoch. I am currently converting this to Python's datetime format using this function: import datetime as dt def days2dt(days_since_epoch): epoch = dt.datetime(1980,

  • Functional programming in Python: 2048 merge functionsMay 21

    I've written the core functions for a basic 2048 game. I know Python isn't a functional programming language, but I like the "functional style" (easier to maintain, simple to understand, more elegant), and have tried to use this rather than iter

  • Python Decorator - inspecting function argument valuesJune 29

    Some of my functions use a "fail_silently" flag. It is used in the following way: def test(a, b, c=1, fail_silently = False) try: return int(a) + c except: if fail_silently: return None raise Therefore, if there is an error, we catch it and fail

  • Import Python multiprocessing main function in another Python file as a moduleJanuary 16

    I am a physicist and I am new to Python just recently migrated from MATLAB. I need to implement multiprocessing to perform calculations for a genetic optimization algorithm. I have managed to make multiprocessing work in single python file using info

  • Opencv python: apply arbitrary function to each pixel without iteratingJanuary 28

    Is there a way in opencv with python to apply an arbitrary function to each pixel in an image while taking advantage of numpy's optimizations? For example, I'd like to get the sigmoid of a single-channel image: out_image[:,:] = 1 / (1 + math.exp(-in_

  • Python combine two integers to a thirdFebruary 16

    We have three integers: A, B, C where the first (i.e. the most significant) digit of C is the first digit of A, the second digit of C is the first digit of B, the third digit of C is the second digit of A, the fourth digit of C is the second digit of

  • Python: Combine multiple ShapefilesJune 15

    i have got a lot of Point-Shapefiles (r20010101.shp, r20010202.shp, r20020101.shp...). Each of them has the same amount of points at exactly the same location. In a column that is named after the Date e.g. "D20010101", where i have got unique Va

  • Suggestions for improving collection of related PHP functions?July 11

    I am working on a Web chat application. In the app, users can run commands (such as /nick, /msg, etc) by starting with a / and then typing the command followed by parameters. For example, if someone wanted to change their username to "foobar", t

Copyright (C) 2018 ceus-now.com, All Rights Reserved. webmaster#ceus-now.com 14 q. 0.956 s.