I have a data.frame with a single column, a vector of strings.

These strings contain duplicate values. I want to find the strings that have duplicates in this vector and write the index of each position in new columns.

So for example, consider I have:

    dt <- data.frame(string = c("a", "b", "c", "d", "e", "f", "a", "c", "f", "z", "a"))

I want to get:

    string match1 match2 match3 matchx....
    a      1      7      11
    b      2      NA     NA
    c      3      8      NA
    d      4      NA     NA
    e      5      NA     NA
    f      6      9      NA
    a      1      7      11
    c      3      8      NA
    f      6      9      NA
    z      10     NA     NA
    a      1      7      11

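The real string vector is way longer than in this example, and I do not know the maximum number of columns I will need.

What would be an effective way to do this? I know there is the duplicated function, but I am not sure how to combine it to get the result I want here.

Many thanks!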

Here is one option with data.table. After grouping by 'string', get the sequence (seq_len(.N)) and the row index (.I), dcast to 'wide' format, and join with the original dataset on 'string'.

    library(data.table)
    # number each occurrence within a group (V1) and record its row index (V2),
    # reshape to wide, then join back onto the original rows
    dcast(setDT(dt)[, .(seq_len(.N), .I), string],
          string ~ paste0("match", V1))[dt, on = "string"]
    #     string match1 match2 match3
    #  1:      a      1      7     11
    #  2:      b      2     NA     NA
    #  3:      c      3      8     NA
    #  4:      d      4     NA     NA
    #  5:      e      5     NA     NA
    #  6:      f      6      9     NA
    #  7:      a      1      7     11
    #  8:      c      3      8     NA
    #  9:      f      6      9     NA
    # 10:      z     10     NA     NA
    # 11:      a      1      7     11
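To make the chained call above easier to follow, here is a rough step-by-step sketch of the same idea. The names `long`, `wide`, `result`, `occurrence` and `row_idx` are just illustrative and not part of the original answer, and `value.var` is given explicitly here instead of letting dcast guess it.

```r
library(data.table)

# Example data from the question, built directly as a data.table
dt <- data.table(string = c("a", "b", "c", "d", "e", "f", "a", "c", "f", "z", "a"))

# Step 1 (long format): one row per occurrence, holding the occurrence
# number within its group and the row position in the original table
long <- dt[, .(occurrence = seq_len(.N), row_idx = .I), by = string]

# Step 2 (wide format): one row per distinct string, one column per occurrence
wide <- dcast(long, string ~ paste0("match", occurrence), value.var = "row_idx")

# Step 3: join back so that every original row gets all positions of its string
result <- wide[dt, on = "string"]
```

Strings that occur fewer times than the most frequent one get NA in the remaining match columns, so the number of columns does not have to be known in advance.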

Another option is to split the sequence of rows by 'string', pad the list elements with NA where the length is less than the maximum, and merge with the original dataset (using base R methods).

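    # row positions of each distinct string
    lst <- split(seq_len(nrow(dt)), dt$string)
    # pad every element to the longest length, bind into a matrix, and merge
    merge(dt, do.call(rbind, lapply(lst, `length<-`, max(lengths(lst)))),
          by.x = "string", by.y = "row.names")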
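In case the padding step is unclear, here is a small sketch that breaks the base R approach into separate steps. The names `max_len`, `padded`, `pos`, `looked_up` and `result` are made up for illustration. Note one deliberate difference from the code above: instead of merge(), which sorts the result by the key column by default, this sketch looks the rows up by name so that the original row order is kept.

```r
# Example data from the question
dt <- data.frame(string = c("a", "b", "c", "d", "e", "f", "a", "c", "f", "z", "a"),
                 stringsAsFactors = FALSE)

# Step 1: row positions of each distinct string, e.g. lst$a is c(1, 7, 11)
lst <- split(seq_len(nrow(dt)), dt$string)

# Step 2: pad every element with NA up to the longest length,
# e.g. padded$b becomes c(2, NA, NA)
max_len <- max(lengths(lst))
padded <- lapply(lst, `length<-`, max_len)

# Step 3: bind into a matrix with one row per distinct string
pos <- do.call(rbind, padded)
colnames(pos) <- paste0("match", seq_len(ncol(pos)))

# Step 4: look up each original row by its string (keeps the original order)
looked_up <- pos[as.character(dt$string), , drop = FALSE]
rownames(looked_up) <- NULL
result <- cbind(dt, looked_up)
```

Because `max_len` is computed from the data, this should work without knowing the number of match columns beforehand.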


    dt <- data.frame(string = c("a", "b", "c", "d", "e", "f", "a", "c",
                                "f", "z", "a"), stringsAsFactors = FALSE)


