Why it is a BAD idea to use the attributes of a class outside the class. An example

Programming and Algorithms II, Degree in Bioinformatics
Ricard Gavalda, Nov 3 2017

-----------

Suppose that our team is building a program to manage the stays of patients in hospitals. We have a class Patient where, to simplify, each patient has these attributes:

- id (an identifier string, such as the DNI)
- name (a string)
- sex ("M" or "F")
- birthdate (a string)

so that we can create a Patient such as

    p = Patient("46139872T", "Proust, Marcel", "M", "12/09/1976")

(Patient may have setter and getter methods for these attributes, and other methods and attributes. This does not matter for this discussion.)

In the team, you are in charge of creating the class hospital. We want to create a hospital with N rooms like this:

    h = hospital(N)

Rooms will be numbered 0 to N-1. They are initially all free. Then we want to be able to

1. admit a patient to the hospital. S/he will occupy any free room, provided there is at least one free room. If none exists, the patient cannot be admitted.
2. release (send home) a patient that is in the hospital. The room that s/he occupied becomes free.

To this effect, you decide that a hospital will have two attributes:

- a list "free" of length N where position i contains True to indicate that room i is free, False otherwise.
- a list "patients" of pairs (patient, room s/he occupies).

So you write the class like:

    class hospital:

        def __init__(self, N):
            self.free = [True] * N    # a list containing N True's
            self.patients = []        # pairs (patient, room)

        def admit(self, p):
            # admit patient p
            # first search for a patient with identifier p.id in self.patients
            if any(q.id == p.id for (q, r) in self.patients):
                print("the patient is already in. Can't be admitted again")
            elif True not in self.free:
                print("Sorry, full hospital. Can't be admitted")
            else:
                r = self.free.index(True)    # a position r with self.free[r] == True (a free room)
                self.free[r] = False
                self.patients.append((p, r))

        def release(self, id):
            # look for a patient with identifier id in self.patients
            ids = [q.id for (q, r) in self.patients]
            if id not in ids:
                print("this patient is not in the hospital. Can't be released")
            else:
                i = ids.index(id)          # position i of that patient
                p, r = self.patients[i]    # get pair (patient, room)
                self.free[r] = True        # room r is free now
                del self.patients[i]       # remove (p, r) from the list

Cool! This works! You publish the class for your colleagues in the programming team. You give them this little "manual" for the class:

"This class represents a hospital. You create a hospital with N rooms by calling hospital(N). You can admit patient p to a hospital h by calling h.admit(p). You can release a patient with identifier id by calling h.release(id). And if you need anything else, well, you just read and write these two attributes:

- free: a list of length N containing Trues and Falses. free[i] means room i is free
- patients: a list of pairs (patient, room where s/he is)"

Your teammates start using the class for their parts of the project. They know the outside (methods) and the inside (attributes) of the class, so what can go wrong?

Some days later your teammate Anna needs to know the number of free rooms in the hospital. No problem, she writes the code

    free_rooms = h.free.count(True)

Your teammate Barbara needs to know the proportion of male patients currently in the hospital. No problem, she writes:

    proportion_males = float(sum(1 for i in range(len(h.patients)) if h.patients[i][0].sex == "M")) / float(len(h.patients))

(She is a good Python programmer; see how she counts the number of males in 1 line!)

Your teammate Carles is asked to program a new functionality of the hospital: transferring a patient from his/her current room to another room. He cannot change the class, because you are the owner of the class, so he cannot add a method to the class.
So he writes a function that takes the hospital as a regular parameter (no self):

    def transfer(h, id, new_room):
        # look for the position i of the patient with identifier id in h.patients
        # (this assumes the patient is in the hospital)
        i = [q.id for (q, r) in h.patients].index(id)
        (p, r) = h.patients[i]
        # mark room r free, and room new_room not free
        h.free[r] = True
        h.free[new_room] = False
        # change the room in patient list
        h.patients[i] = (p, new_room)

So everybody finds their way around: if they need more functionality from the hospital, they can just go to the attributes and read from and write to the contents of the hospital.

But one day in the PA2 course you learn that dictionaries are much faster than lists. So you change your list "patients" to be a dictionary whose keys are patient ids. Now patients.get(id) returns the pair (patient with that id, his/her room), or None if the patient is not there. So both admit and release run in constant time instead of linear in the number of patients.

Also, you realize that finding a free room to admit a patient takes time, because you have to look for one position with True going sequentially through the list self.free. You think of another strategy: let the list free contain the numbers of the free rooms. For example, if self.free == [10, 26, 15, 19], you know that there are four free rooms, numbers 10, 26, 15 and 19. Now finding a free room is immediate: just get the first element of the list (10), and remove it to occupy that room. Even better, get the last one (19), because removing the last element of a list is faster (constant time) than removing the first one (linear time). Freeing up room r is simply self.free.append(r), which is also constant time.

You make these changes, send the new version of the class to your teammates, and go off for the weekend with a sense of work well done. Boy, the new class is so much faster, will they be happy!

Instead, you start getting angry emails.

Anna says: "hey, what did you do?? my line free_rooms = h.free.count(True) doesn't work any more. It always tells me that there are no free rooms!"
(Of course: now free contains ints, not Trues and Falses, so h.free.count(True) no longer counts free rooms. It returns 0, or 1 if room number 1 happens to be free, because in Python 1 == True.)

Barbara says: "hey, what did you do?? my line sum(1 for i in range(len(h.patients)) if h.patients[i][0].sex == "M") doesn't work any more. I worked hard to get that line right. Please undo whatever changes you did."
(Of course: h.patients is now a dictionary where the key is a Patient id, not a position i of a list.)

Carles says: "hey, what did you do?? My function transfer(...) doesn't work any more. When I write

    h.free[r] = True
    h.free[new_room] = False
    # change the room in patient list
    h.patients[i] = (p, new_room)

now the list h.free gets all messed up, and h.patients[i] tells me there is no key i"

And so on. Everybody in the team is begging you to fix this "very wrong" version of the class right away, because nothing works.

You tell Anna, Barbara, Carles, etc. that you have changed free and patients to have this other structure. That they need to change all their code to adapt to the changes. That they should be happy because the class will be much faster now.

They say "Are you crazy??? We've written things like this in dozens of places in our code. It would take us weeks to change all that, then debug everything and have it working again!!"

So you put the original class back. Everybody is stuck with a bad class implementation, which is slow and cannot be adapted to new needs.

--------

In an alternate reality:

- you don't tell them what the class attributes are. You only tell them the methods.
- you add a few more methods anticipating reasonable needs, like a method to know the number of free rooms, one to transfer a patient, and one to get a list of all the patients in the hospital (that would suffice, for example, to get the proportion of males).

If you later want to change the implementation (attributes, and the methods that use them), only *you* need to make changes, and only inside the class.
That is much easier than the team going over all their code and changing dozens of places where they used the attributes in uncontrolled ways. The change is transparent to them. All they will see is that all of a sudden the class is faster! They send you happy messages: "Hey, I don't know what magic you did there, but the new version of hospital really rocks! Thank you!"

- if there is enough popular demand for one method that many people will use, you can add it to the class. But it is your job only, and you do it in one and only one way.
- if someone has special needs for a method that probably no one else will use, s/he can use inheritance to adapt the class to their needs, like this

      class my_hospital(hospital):
          ... methods that hospital does not provide and that I need ...

  so only these methods need to know the attributes of the class hospital, or can change them. In this class only. You program this once. There are no mentions of the attributes in a dozen places of the outer code.

For example, take Barbara, who needs to know the fraction of males. She can use your method to get a list with all the patients and count there. That takes linear time in the number of patients in the hospital. But she can also create a class my_hospital where she adds a new attribute num_males. The attribute is

- created in __init__ and set to 0, (1 line)
- incremented by 1 when a male patient is admitted, (1 line)
- decremented by 1 when a male patient is released, (1 line)
- and accessible from the outside via a simple getter operation

      def get_frac_males(self):
          return float(self.num_males) / float(len(self.patients))

5 lines of code in total, no access to attributes from the outside, and everything remains constant time.

Oh, yes, and instead of creating h = hospital(N) she needs to create h = my_hospital(N) everywhere in her code. But that is far easier than reprogramming the computations: no thinking, just use the search-and-replace option of your editor.
Modern IDE tools actually can do it for you automatically across many files.
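
To make the alternate reality concrete, here is a minimal sketch of what the faster hospital and Barbara's my_hospital could look like. It is only one possible version, under several assumptions that are not in the story above: the method names num_free_rooms, get_patients and transfer are made up for illustration, admit returns True/False and release returns the released patient (or None) so that the subclass can react to the outcome, get_frac_males returns 0.0 for an empty hospital, and Patient is a bare stand-in for the team's real class.

```python
class Patient:
    # Minimal stand-in for the team's Patient class (an assumption for this sketch)
    def __init__(self, id, name, sex, birthdate):
        self.id = id
        self.name = name
        self.sex = sex
        self.birthdate = birthdate


class hospital:
    def __init__(self, N):
        self.free = list(range(N))   # numbers of the free rooms
        self.patients = {}           # id -> (patient, room)

    def admit(self, p):
        # Returns True if p was admitted, False otherwise (assumed convention)
        if p.id in self.patients:
            print("the patient is already in. Can't be admitted again")
            return False
        if not self.free:
            print("Sorry, full hospital. Can't be admitted")
            return False
        r = self.free.pop()          # take the last free room: constant time
        self.patients[p.id] = (p, r)
        return True

    def release(self, id):
        # Returns the released patient, or None if not found (assumed convention)
        if id not in self.patients:
            print("this patient is not in the hospital. Can't be released")
            return None
        p, r = self.patients.pop(id)
        self.free.append(r)          # room r is free again: constant time
        return p

    def num_free_rooms(self):
        return len(self.free)

    def get_patients(self):
        # A new list, so callers never touch the hospital's own dictionary
        return [p for (p, r) in self.patients.values()]

    def transfer(self, id, new_room):
        # Moves the patient to new_room if possible; returns True on success
        if id not in self.patients or new_room not in self.free:
            return False
        p, r = self.patients[id]
        self.free.remove(new_room)   # linear in the number of free rooms
        self.free.append(r)
        self.patients[id] = (p, new_room)
        return True


class my_hospital(hospital):
    # Barbara's variant: keeps a running count of male patients
    def __init__(self, N):
        super().__init__(N)
        self.num_males = 0

    def admit(self, p):
        admitted = super().admit(p)
        if admitted and p.sex == "M":
            self.num_males += 1
        return admitted

    def release(self, id):
        p = super().release(id)
        if p is not None and p.sex == "M":
            self.num_males -= 1
        return p

    def get_frac_males(self):
        if not self.patients:        # empty hospital: avoid dividing by zero
            return 0.0
        return self.num_males / len(self.patients)
```

With this version Anna writes h.num_free_rooms(), Carles calls h.transfer(id, new_room), and Barbara either counts over h.get_patients() or calls h.get_frac_males() on a my_hospital. None of them mentions free or patients, so switching those attributes back to lists, or to anything else, would not break their code.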