Home > other >  Creating a deep copy to prevent value manipulation
Creating a deep copy to prevent value manipulation

Time:07-31

I've found this to be quite a tricky one to solve due to me not understanding some things that would potentially allow me to successfully make this piece of code work as intended. The intention is to create a deep copy of the date inside of the ToDo class such that the date cannot be manipulated (check output). The Date class is not meant to be changed, as the exercise states that all can and should be done inside of the ToDo class. Code is provided below:

public class Date {

    private int [] ymd = new int [3];
    
    public Date (int year, int month, int day) {
        ymd[0]= year;
        ymd[1]= month;
        ymd[2]= day;
    }
    public void setYear (int year) { ymd[0] = year; }
    public void setMonth (int month) { ymd[1] = month; }
    public void setDay (int day) { ymd[2] = day; }
    
    public int getYear () { return ymd[0]; }
    public int getMonth () { return ymd[1]; }
    public int getDay () { return ymd[2]; }
    
    @Override
    public String toString() {
        return ymd[0]   "-"   ymd[1]   "-"   ymd[2];
    }
}

public class ToDo {

        private String what;
        private Date when;

        public ToDo (String what, int year, int month, int day) {
            this.what = what;
            this.when = new Date (year, month, day);
        }
        
        public String getWhat () { return what; }
        
        public Date getWhen () {
            return this.when;
        }

        @Override
        public String toString() {
            return "Todo: "   what   "; date: "   when.toString();
        }       
        
        public static void main (String [] args) {
            ToDo important = new ToDo ("Meeting", 2022, 8, 22);
            String a = important.toString();
            System.out.println(a);

            Date d = important.getWhen();
            d.setMonth(5);
            d.setDay(34); 
            
            System.out.println(important);
      }
}

Output:

Todo: Meeting; date: 2022-8-22

Todo: Meeting; date: 2022-5-34


Note: main method and ToDo constructor (and maybe getWhen method) I feel are of significant interest to determining the root of the issue.

I kind of understand what needs to be done, but of course not how. As you see, the date can be manipulated because I assume a shallow copy is being used here which allows for the date to be changed as it was here. If I understand correctly, the instantiated object 'd' uses 'important' as the reference for which values to use and then change them.

I have attempted quite a few things that are nowhere near simple enough that a beginner would be able to come up with, and I expect that there is definitely some answer that is practically right in front of me, which I cannot see. An assumption I have is that potentially there could be another constructor added for creating a copy properly. Maybe? I don't know.

What about the reference in the variable 'when' being returned from getWhen? Would returning it this way make it an inappropriate way to handle this? If it is inappropriate, would not returning it from getWhen but something else be the better alternative?

What kind of value is in the instance variable 'when', and thus which value is returned from the getWhen method?

A big takeaway from this is that I am missing a fair bit of the pieces that I need to solve this problem due to me lacking understanding in some things. It has been a while since I've last coded as well so I'm still in the process of refreshing my memory of these things that are stored in all of my 3 brain cells, so it will be interesting to see what happens.

CodePudding user response:

I would just create an interface and call it IDate (or something better) and implement this interface through your Date class. Then create another class which will be called ReadOnlyDate and that is what you will return.

The difference between those classes is that one has setters, the other not.

Here is what I mean:

IDate

public interface IDate {
    int getYear ();
    int getMonth ();
    int getDay ();
}

ReadOnlyDate

public class ReadOnlyDate implements IDate {
    private int [] ymd = new int [3];
    
    public ReadOnlyDate(Date d){
        ymd[0]= d.getYear();
        ymd[1]= d.getMonth();
        ymd[2]= d.getDay();
    }
    
    public ReadOnlyDate (int year, int month, int day) {
        ymd[0]= year;
        ymd[1]= month;
        ymd[2]= day;
    }

    public int getYear () {return ymd[0];}
    public int getMonth () {return ymd[1];}
    public int getDay () {return ymd[2];}

    @Override
    public String toString() {
        return ymd[0]   "-"   ymd[1]   "-"   ymd[2];
    }
}

Date

public class Date implements IDate {
    private int [] ymd = new int [3];

    public Date (int year, int month, int day) {
        ymd[0]= year;
        ymd[1]= month;
        ymd[2]= day;
    }
    public void setYear (int year) {ymd[0] = year;}
    public void setMonth (int month) {ymd[1] = month;}
    public void setDay (int day) {ymd[2] = day;}

    public int getYear () {return ymd[0];}
    public int getMonth () {return ymd[1];}
    public int getDay () {return ymd[2];}

    @Override
    public String toString() {
        return ymd[0]   "-"   ymd[1]   "-"   ymd[2];
    }

    public ReadOnlyDate toReadOnlyDate() {
        return new ReadOnlyDate(this);
    }
}

And in your ToDo class, use this getter:

public IDate getWhen () {
    return this.when.toReadOnlyDate();
}

This is one of the many ways to implement this.

CodePudding user response:

Problem

You are describing the following problem statement:

  • prevent a caller from changing the internal Date inside a ToDo object
  • leave Date class as is – no changes allowed

Starting point

Here's something I added to your Date class, just as an informative temporary step:

public int getWhenHashCode() {
    return when.hashCode();
}

With that, it's possible to:

  • create a new ToDo,
  • print the hashCode for the Date object inside,
  • ask for the Date object,
  • print that hashcode, too

This should (and does) print the the same hashCode two times.

This is your starting point – when calling getWhen(), the caller gets the same object that the internal ToDo instance has. From there, if the caller makes any changes to their Date, it changes the other Date, too – there is only one object instance.

ToDo first = new ToDo("first", 1, 2, 3);
System.out.println(first.getWhenHashCode());
Date when = first.getWhen();
System.out.println(when.hashCode());

617901222
617901222

Solution

There's an easy way to solve this – for any calls to when(), instead of returning the actual underlying object to the caller, make a new Date and return that. That way, the caller can make changes to their Date without affecting the internal Date belonging to the ToDo instance.

You can do that by changing your getWhen() method to make a copy instead of returning the object's actual instance, like this:

public Date getWhen() {
//    return this.when;
    return new Date(this.when.getYear(), this.when.getMonth(), this.when.getDay());
}

Running the same test again now prints two different hashCode values because there are two different objects. If the caller makes any changes to the result, it will not change anything about the original Date object inside the instance of ToDo.

617901222
1159190947

In the end, the only necessary change is to the return statement in "getWhen()". The "getWhenHashCode()" method I added is for informational purposes only, doesn't have anything to do with a final solution.

CodePudding user response:

Yes, deep copies would protect against this, but deep copies are incredibly expensive. In all senses of the word: You need to write a ton of code (because java has no non-hacky ways to just deep-copy anything; you have to manually write out clone() methods or copy constructors for all the things, and if you're using third party / java.* classes, generally you have to look up how to clone those or write static utility methods that can do it yourself). That's expensive from a 'your precious time' perspective.

Also, of course, the deep clone op itself is expensive. You're cloning everything, and not even by just blitting entire blocks of memory in one go.

This is a bit of a 'doctor, it hurts when I press here!' problem. The right solution is simply to stop doing it.

The go-to solution: Immutable data structures!

Let's look at String, for example. You can pass a string without (deep) cloning it all day long and you won't run into this problem. Why? No .set methods. Given a string object you cannot change it. At all.. String has no .setCharAt, .replace, .delete, .clear, or any other method that changes the string. Yes, string has e.g. .substring and .toLowerCase(), but this does not change the string. It makes a new string with the requested modification applied to it.

Given that a string cannot change, there is no need to ever clone them. Not even when you want a 'deep copy' of a thing.

Apply that to all relevant data structures and the need to (deep) copy just disappears entirely. There is no longer any need to do it now.

Crucial case in point: You really should delete that Date class you wrote:

  • It clashes with java.util.Date. Generally, naming your types identically to type names in extremely often used core java packages such as java.lang, java.util, and java.io is a really bad idea, as it utterly destroys readability: Anybody reading your code (notably, including yourself, a month down the road when you forgot you made your own type called Date) will initially assume it's java.util.Date. You could simply name it MyDate or ToDoDate or whatnot, but...
  • Why are you reinventing the wheel? Unfortunately, java.util.Date is an oxymoron - a lie. It does not represent dates (it represents a timestamp, which just isn't the same thing). However, java.time.LocalDate does represent precisely what your Date class represents, so just use that. Crucially, j.t.LocalTime is in fact immutable, and therefore, does not need to be copied. You can pass your instance of LocalDate to any caller and they can't 'mess' with you. All operations LocalDate has (such as localDate.addDays(1)) does not modify it. It simply makes a new LocalDate and returns that.

Immutable data structures do bring their own baggage, but this baggage is less severe than mutable data structures. The biggest issue is a 'deep change': Where you have a complicated 'tree' of data; if it's all immutables and you want to change something deep in the data structure, you end up having to write something ridiculous, like:

Movie m = myMovie.withDirector(myMovie.getDirector().withBirthdate(myMovie.getDirector().getBirthdate().withMonth(Month.FEBRUARI)));

One solution to this is the with-by concept: Make lambdas. It would look like this:

Movie m = myMovie.withDirectorBy(d -> d.withBirthdateBy(b -> b.withMonth(Month.FEBRUARI)));

A withXBy method takes a 'converter': A UnaryOperator<T> that takes in a thing and returns a new thing. You have to write all these by hand (or use Project Lombok, though this withby thing is still in a branch and not yet in the latest release unfortunately. SOURCE: I am a core contributor to that project).

Weakening it a tad: Hybrid model.

You can have a todo entry (so, the name, due date, description, creator, etc) be immutable, but have the most top-level stuff (the entire list of todos for example) be mutable. You now do have to clone things, but you can stop when you hit the immutable. It's no longer a complete deep clone: Just clone the list of immutables (the list isn't immutable, but the elements in it are) shallowly and you're done.

This tends to be practical. Whatever common modifications exist, make a method that is like String's .toLowerCase(): It doesn't change the object, it makes a new one with the change applied.

Weakening it some more: "Stop hitting yourself!"

Sometimes the simplest solutions do work. Yes, if you hand out mutable objects to callers without (deep) cloning, they can modify it and this then breaks everything. One really silly solution to it is to document the method with: /** You can mutate it, but.. don't */. If it's your project, well, just listen and apply your own rules. Don't knock it - large swaths of really large and very commonly used projects, such as for example javac (the compiler, itself) have this notion all over the place. The key aspect is to make crystal clear to callers that the thing you give them MUST NOT just be modified, even though you can. The aim is not to stop malicious code (code written by someone who is intending to muck it all up) - simply to avoid bugs (code written by someone who didn't realize they can't modify it, and was just trying to get the job done, accidentally writing a bug). This turns into a bit of a judgement call: You have to walk a mile in the shoes of your API user and determine how likely it is they make a mistake and would modify something in a way that is likely to cause unintended behaviour. For these hobby projects, the only 'API user' tends to be yourself (you're the only one writing for this project), which means the odds that you get confused and forget you aren't supposed to mutate them, is low.

It's most certainly not elegant at all. But it is simple (you don't need to write withBy methods, for example).

  • Related