I have something like this:
duplicates = ['a','b','c','d']
if duplicates.length > 4
Photo.includes(:tags).where('tags.name IN (?)',duplicates)
.references(:tags).limit(15).each do |f|
returned_array.push(f.id)
end
end
duplicates is an array of tags that were duplicated with other Photo
tags
What I want is to get Photo
which includes all tags from duplicates array, but right now I get every Photo
that include at least one tag from array.
THANKS FOR ANSWERS:
I try them and somethings starts to work but wasn't too clear for me and take some time to execute.
Today I make it creating arrays, compare them, take duplicates which exist in array more than X times and finally have uniq array of photos ids.
CodePudding user response:
The issue as I see it is that you're only doing one join, which means that you have to specify that tags.name
is within the list of duplicates.
You could solve this in two places:
- In the database query
- In you application code
For your example the query is something like "find all records in the photos table which also have a relation to a specific set of records in the tags table". So we need to join the photos table to the tags table, and also specify that the only tags we join are those within the duplicate list.
We can use a inner join for this
select photos.* from photos
inner join tags as d1 on d1.name = 'a' and d1.photo_id = photos.id
inner join tags as d2 on d2.name = 'b' and d2.photo_id = photos.id
inner join tags as d3 on d3.name = 'c' and d3.photo_id = photos.id
inner join tags as d4 on d4.name = 'd' and d4.photo_id = photos.id
In ActiveRecord it seems we can't specify aliases for joins, but we can chain queries, so we can do something like this:
query = Photo
duplicate.each_with_index do |tag, index|
join_name = "d#{index}"
query = query.joins("inner join tags as #{join_name} on #{join_name}.name = '#{tag}' and #{join_name}.photo_id = photos.id")
end
Ugly, but gets the job done. I'm sure there would be a better way using arel
instead - but it demonstrates how to construct a SQL query to find all photos that have a relation to all of the duplicate tags.
The other method is to extent what you have and filter in the application. As you already have the photos that has at least one of the tags, you could just select those which have all the tags.
Photo
.includes(:tags)
.joins(:tags)
.where('tags.name IN (?)',duplicates)
.select do |photo|
(duplicates - photo.tags.map(&:name)).empty?
end
(duplicates - photo.tags.map(&:name)).empty?
takes the duplicates array and removes all occurrences of any item that is also in the photo tags. If this returns an empty array then we know that the tags in the photo had all the duplicate tags as well.
This could have performance issues if the duplicates array is large, since it could potentially return all photos from the database.
CodePudding user response:
If you want to find photos that have all the given tags you just need to apply a GROUP and use HAVING to set a condition on the group:
class Photo
def self.with_tags(*names)
t = Tag.arel_table
joins(:tags)
.where(tags: { name: names })
.group(:id)
.having(t[:id].count.eq(tags.length)) # COUNT(tags.id) = ?
end
end
This is somewhat like a WHERE clause but it applies to the group. Using .gteq
(>=
) instead of .eq
will give you records that can have all the tags in the list but may have more.
A better way to solve this is to use a better domain model that doesn't allow duplicates in the first place:
class Photo < ApplicationRecord
has_many :taggings
has_many :tags, through: :taggings
end
class Tag < ApplicationRecord
has_many :taggings
has_many :photos, through: :taggings
validates :name,
uniqueness: true,
presenece: true
end
class Tagging < ApplicationRecord
belongs_to :photo
belongs_to :tag
validates :tag_id,
uniqueness: { scope: :photo_id }
end
By adding unique indexes on tags.name
and a compound index on taggings.tag_id
and taggings.photo_id
duplicates cannot be created.